VRAM Filling Up, both cards

I’m working on batches that are pretty simple. The only thing is they each have ML Human Face extractions (cheeks, chin, etc.). The usability of these nodes is kind of suspect, so I’ve been putting them into BFXs. That makes their rendering permanent, with fewer issues.
Problem is that VRAM is constantly filling up, and yes, it is definitely due to these nodes. Restarting is the only way to free it up.
Anyone gotten around this issue?

I haven’t had this issue with these nodes, but there are known GPU memory leaks. And yes, the only option is to restart and file a support case. It may or may not be the same leak that others have experienced.

I’ve observed that as VRAM usage approaches capacity, certain nodes within Batch begin to behave unpredictably, occasionally resulting in application crashes. To mitigate potential data loss, I submitted a feature request for Flame to display a warning notification when VRAM usage reaches 75%, providing users an opportunity to save their work and restart the application.

Additionally, I proposed that this threshold be user-configurable, allowing artists to define their own preferred VRAM usage limits based on their workflow and system performance.

If this feature would be valuable to you as well, please consider supporting the request by upvoting it here:

FI-03458


You can set the VRAM allocation in the Setup utility. It defaults to Automatic, but you can set it to 80% or whatever you prefer. I was instructed to do that by support when we were recently chasing a GPU memory leak.

The other useful thing - open a terminal window and type

watch -n 0.5 nvidia-smi

It will update every half second with the actual memory usage reported by the GPU.

A lot more detail about all of this here.

In my experience Flame honors those limits, until something goes haywire and then it just blows right past them.


Thank you for sharing! My targets were already set to 45%. Flame consumes them effortlessly (eventually…).

Ah right - you were asking about an alert.

You might be able to write a Python script that takes the output of nvidia-smi, looks for the utilization percentage, and then beeps or does something to get your attention.

If you use

nvidia-smi -q --display=UTILIZATION

you get something that may be easier to parse.
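Here’s a rough sketch of that idea, assuming Python 3 and the stock nvidia-smi CLI. It uses the --query-gpu flags instead of the -q output because they’re a bit easier to parse programmatically; the 75% threshold and 30-second poll interval are just placeholders, not anything Flame-specific.

#!/usr/bin/env python3
# Poll nvidia-smi and beep when VRAM usage crosses a threshold.
import subprocess
import time

THRESHOLD = 0.75   # fraction of VRAM that triggers the alert
INTERVAL = 30      # seconds between polls

def vram_fraction():
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    # One line per GPU, e.g. "10240, 24576" (MiB) -- return the worst offender.
    fractions = []
    for line in out.strip().splitlines():
        used, total = (float(x) for x in line.split(","))
        fractions.append(used / total)
    return max(fractions)

while True:
    if vram_fraction() >= THRESHOLD:
        # Terminal bell; swap in a desktop notification if you prefer.
        print(f"\aVRAM above {THRESHOLD:.0%} -- save your work!")
    time.sleep(INTERVAL)

You could swap the print for notify-send or anything else that grabs your attention while you’re working in Flame.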

With respect, the primary limitation of a Python script is that it’s only accessible to those who actively seek it out and download it. If the functionality were implemented natively by the development team, it would benefit a much broader group of users by being readily available to all.


Another nice tool for tracking your VRAM is nvtop - it gathers the same info as nvidia-smi but includes a graphical trace. That way you can just double-tap the meta key to peek at your usage and be right back in Flame with no disruption.

To install, just enter in a terminal

sudo dnf install nvtop

(you might need to run sudo dnf install -y epel-release first)


Thanks to Jean Filiatrault, who’s been working with me to help figure out where the problem is. So far no luck, but if anything turns up, I’ll make sure to post here.


When I had the issues, I wasn’t using these nodes. So it could be unrelated. Or it could be a common code path that’s showing up in different places.

The defect # for my case is FLME-68999, just in case they can see some internal notes that may be helpful.

And fingers crossed for you. These situations are very tricky to track down.
