Frustrating day in Flame

Adding some useful detail and a utility to this discussion.

Flame has the resource manager, but it only shows a snapshot of resources; it’s not a good real-time monitor.

Turns out NVIDIA includes a good monitor. In a terminal, run:

watch -n 0.5 nvidia-smi

It produces a hardware dashboard of your GPU, refreshed every 0.5 seconds, which includes temperature, load, memory consumption, and memory consumption by process.

Interestingly enough, while Flame is stable it occupies the 19.3 GB that is set as the threshold in Setup (80%, as instructed by support).

After 5 minutes, when things start going south, you can see Flame breaking its own rule: GPU memory steadily increases until it hits 24.0 GB and exhausts the physical memory of the GPU. And then Flame dies.

It also confirms that the memory-full condition is related to the GPU, as we can watch it happen in real time.
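
If you would rather capture the climb in a log than watch it live, something like the following might work. It is only a rough sketch, not anything from Autodesk or NVIDIA: it assumes nvidia-smi is on your PATH and that your driver reports all five queried fields as plain numbers. The function names, the gpu_memory.log file name, and the 0.80 default (chosen to mirror the 80% setting in Setup) are my own placeholders.

```python
import subprocess
import time

# These are nvidia-smi --query-gpu properties; run
# `nvidia-smi --help-query-gpu` to see what your driver supports.
QUERY = "timestamp,memory.used,memory.total,temperature.gpu,utilization.gpu"


def parse_sample(line):
    """Parse one CSV line (noheader,nounits format) into a dict.

    Assumes every field is numeric; a field reported as not
    available on your card would make int() raise ValueError.
    """
    stamp, used, total, temp, util = [f.strip() for f in line.split(",")]
    return {
        "timestamp": stamp,
        "used_mib": int(used),
        "total_mib": int(total),
        "temp_c": int(temp),
        "util_pct": int(util),
    }


def over_limit(sample, fraction=0.80):
    """True when used memory exceeds the given fraction of total memory."""
    return sample["used_mib"] > fraction * sample["total_mib"]


def read_samples():
    """Query nvidia-smi once and return one parsed sample per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_sample(l) for l in out.splitlines() if l.strip()]


def monitor(logfile="gpu_memory.log", interval=0.5):
    """Append one line per GPU to logfile every `interval` seconds."""
    with open(logfile, "a") as log:
        while True:
            for s in read_samples():
                flag = " OVER" if over_limit(s) else ""
                log.write(
                    f'{s["timestamp"]} {s["used_mib"]}/{s["total_mib"]} MiB '
                    f'{s["temp_c"]}C {s["util_pct"]}%{flag}\n'
                )
            log.flush()  # keep the log current if the machine locks up
            time.sleep(interval)


# monitor()  # uncomment to start logging; stop with Ctrl+C
```

Start it before launching Flame and leave it running. Lines tagged OVER mark the samples where total GPU memory use went past the 80% mark, so the log should show roughly when the climb started. Note this tracks memory for the whole GPU, not per process.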
