On Saturday, March 31, 2018 5:56:57 AM PDT Chris Wilson wrote:
> Quoting Chris Wilson (2018-03-31 12:00:16)
> > Quoting Kenneth Graunke (2018-03-30 19:20:57)
> > > On Friday, March 30, 2018 7:40:13 AM PDT Chris Wilson wrote:
> > > > For i915, we are proposing to use a quality-of-service parameter in
> > > > addition to that of just a priority that usurps everyone. Due to our HW,
> > > > preemption may not be immediate and will be forced to wait until an
> > > > uncooperative process hits an arbitration point. To prevent that unduly
> > > > impacting the privileged RealTime context, we back up the preemption
> > > > request with a timeout to reset the GPU and forcibly evict the GPU hog
> > > > in order to execute the new context.
> > >
> > > I am strongly against exposing this in general. Performing a GPU reset
> > > in the middle of a batch can completely screw up whatever application
> > > was running. If the application is using robustness extensions, we may
> > > be forced to return GL_DEVICE_LOST, causing the application to have to
> > > recreate their entire GL context and start over. If not, we may try to
> > > let them limp on(*) - and hope they didn't get too badly damaged by some
> > > of their commands not executing, or executing twice (if the kernel tries
> > > to resubmit it). But it may very well cause the app to misrender, or
> > > even crash.
> >
> > Yes, I think the revulsion has been universal. However, as a
> > quality-of-service guarantee, I can understand the appeal. The
> > difference is that instead of allowing a DoS for 6s or so as we
> > currently allow, we allow that to be specified by the context. As it
> > does allow one context to impact another, I want it locked down to
> > privileged processes. I have been using CAP_SYS_ADMIN as the potential
> > to do harm is even greater than exploiting the weak scheduler by
> > changing priority.
Right...I was thinking perhaps a tunable to reduce the 6s would do the
trick, and be much less complicated...but perhaps you want to let it go
longer when there isn't super-critical work to do.

> Also to add further insult to injury, we might want to force GPU clocks
> to max for the RT context (so that the context starts executing at max
> rather than wait for the system to upclock on load). Something like,

That makes some sense - but I wonder if it wouldn't cause more battery
burn than is necessary. The super-critical workload may also be
relatively simple (redrawing a clock), and so up-clocking and
down-clocking again might hurt us...it's hard to say. :(

I also don't know what I think of this plan to let userspace control
(restrict) the frequency. That's been restricted to root (via sysfs) in
the past. But I think you're allowing it more generally now, without
CAP_SYS_ADMIN? It seems like there's a lot of potential for abuse.
(Hello, benchmark mode! Zoooom!) I know it solves a problem, but it
seems like there's got to be a better way...

--Ken
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev