Apologies if I'm hitting the wrong mailing list. Long-time user, first-time 
reporter and all that.

Recently my system has been suffering from instability in the graphics 
stack. Essentially, some application on my system is running graphics memory 
out (OOM). Normally I'd just expect a hard crash of the application in such a 
scenario. Instead, the system enters a spin loop of command submissions and 
slows down dramatically, generally ending with the whole system freezing up.

There are a couple of issues I'd like to point out with the situation I'm 
currently experiencing:

- Most importantly, the error message doesn't provide any useful information 
for tracing the source of the issue: no PID or other diagnostic information.
- It's very noisy when trying to debug. I can occasionally drop my system to a 
separate TTY, but the message just spams the entire screen, making it 
impossible to interact with my system even if I wanted to load up debugging 
tools to analyze the situation.

Given the error message, I believe this line is the source of the log statement:
[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Not enough memory for command submission!
https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c#L1431

Generally I'm wondering if there is anything that can be done to improve the 
experience for end users in such a scenario.

Ideally the system would nuke the misbehaving process, similar to how system 
RAM OOMs are handled.

But at a minimum, I'd like to be able to figure out how to trace this back to 
the misbehaving process. Any help in this regard would be appreciated.
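
One possible way to do that back-tracking, as a minimal sketch rather than a 
tested fix: on kernels where amdgpu exports DRM usage stats through 
/proc/<pid>/fdinfo, each open DRM file should carry keys such as drm-driver, 
drm-client-id and drm-memory-vram. The script below assumes those keys are 
present (older kernels may not provide them), and the helper names 
parse_vram_bytes and vram_by_pid are made up for illustration. It sums the 
reported VRAM per process so the heaviest user is listed first.

```python
import os
from collections import defaultdict

# Unit suffixes allowed by the DRM fdinfo format (no suffix means bytes).
_UNITS = {"": 1, "KiB": 1024, "MiB": 1024 ** 2}


def parse_vram_bytes(fdinfo_text):
    """Return (client_id, vram_bytes) for an amdgpu fdinfo blob, else None.

    Assumes the keys drm-driver, drm-client-id and drm-memory-vram.
    """
    fields = {}
    for line in fdinfo_text.splitlines():
        # Split on the first ':' only; values such as PCI addresses contain ':'.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    if fields.get("drm-driver") != "amdgpu":
        return None
    parts = fields.get("drm-memory-vram", "").split()
    if not parts or not parts[0].isdigit():
        return None
    unit = parts[1] if len(parts) > 1 else ""
    return fields.get("drm-client-id"), int(parts[0]) * _UNITS.get(unit, 1)


def vram_by_pid(proc="/proc"):
    """Sum reported VRAM per PID, counting each DRM client once per process."""
    usage = defaultdict(int)
    for pid in filter(str.isdigit, os.listdir(proc)):
        fdinfo_dir = os.path.join(proc, pid, "fdinfo")
        try:
            fds = os.listdir(fdinfo_dir)
        except OSError:
            continue  # process exited, or we lack permission
        seen = set()
        for fd in fds:
            try:
                with open(os.path.join(fdinfo_dir, fd)) as f:
                    parsed = parse_vram_bytes(f.read())
            except OSError:
                continue
            # Duplicated fds share a client id; skip repeats.
            if parsed is None or parsed[0] in seen:
                continue
            seen.add(parsed[0])
            usage[int(pid)] += parsed[1]
    return dict(usage)


if __name__ == "__main__":
    ranked = sorted(vram_by_pid().items(), key=lambda kv: kv[1], reverse=True)
    for pid, vram in ranked:
        try:
            with open(f"/proc/{pid}/comm") as f:
                name = f.read().strip()
        except OSError:
            name = "?"
        print(f"{pid:>8} {name:<20} {vram / 2**20:10.1f} MiB")
```

Reading other users' fdinfo entries typically requires root, and the numbers 
only show who holds VRAM at that moment, not which submission triggered the 
error.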
