On Tue, Jul 11, 2017 at 12:05 PM, Grazvydas Ignotas <nota...@gmail.com> wrote:
> On Tue, Jul 11, 2017 at 12:21 AM, Marek Olšák <mar...@gmail.com> wrote:
>> From: Marek Olšák <marek.ol...@amd.com>
>>
>> Consider the following situation:
>>   mtx_lock(mutex);
>>   do_something();
>>   util_queue_add_job(...);
>>   mtx_unlock(mutex);
>>
>> If the queue is full, util_queue_add_job will wait for a free slot.
>> If the job which is currently being executed tries to lock the mutex,
>> it will be stuck forever, because util_queue_add_job is stuck.
>>
>> The deadlock can be trivially resolved by increasing the queue size
>> (reallocating the queue) in util_queue_add_job if the queue is full.
>> Then util_queue_add_job becomes wait-free.
>>
>> radeonsi will use it.
>
> Can't this cause the queue to grow uncontrollably, like on GPU hangs,
> making already difficult to debug situations worse? Perhaps
> util_queue_add_job() could have a non-blocking-fail option and the
> caller could then retry after releasing the mutex for a bit.
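
For reference, the grow-instead-of-wait behaviour described in the quoted patch could look roughly like the sketch below. This is only an illustration under assumptions: struct job_ring and the job_ring_* functions are hypothetical names, not Mesa's util_queue, and the sketch leaves out the locking and the worker threads entirely. It only shows that an add on a full ring reallocates rather than blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical ring buffer of job pointers; not Mesa's struct util_queue. */
struct job_ring {
   void **jobs;
   unsigned capacity;   /* number of allocated slots */
   unsigned count;      /* number of queued jobs */
   unsigned read_idx;   /* index of the oldest job */
};

static bool job_ring_init(struct job_ring *q, unsigned capacity)
{
   if (capacity == 0)
      return false;
   q->jobs = calloc(capacity, sizeof(*q->jobs));
   q->capacity = q->jobs ? capacity : 0;
   q->count = 0;
   q->read_idx = 0;
   return q->jobs != NULL;
}

/* Double the storage, copying queued jobs so the oldest lands at index 0. */
static bool job_ring_grow(struct job_ring *q)
{
   unsigned new_capacity = q->capacity * 2;
   void **jobs = calloc(new_capacity, sizeof(*jobs));
   if (!jobs)
      return false;
   for (unsigned i = 0; i < q->count; i++)
      jobs[i] = q->jobs[(q->read_idx + i) % q->capacity];
   free(q->jobs);
   q->jobs = jobs;
   q->capacity = new_capacity;
   q->read_idx = 0;
   return true;
}

/* Add a job without waiting for a free slot: a full ring is grown instead.
 * The caller is assumed to hold whatever lock protects the ring. */
static bool job_ring_add(struct job_ring *q, void *job)
{
   if (q->count == q->capacity && !job_ring_grow(q))
      return false;   /* only an allocation failure makes the add fail */
   assert(q->count < q->capacity);
   q->jobs[(q->read_idx + q->count) % q->capacity] = job;
   q->count++;
   return true;
}

/* Remove and return the oldest job, or NULL if the ring is empty. */
static void *job_ring_remove(struct job_ring *q)
{
   if (q->count == 0)
      return NULL;
   void *job = q->jobs[q->read_idx];
   q->read_idx = (q->read_idx + 1) % q->capacity;
   q->count--;
   return job;
}
```

Nothing in this sketch bounds the growth, which is exactly the concern raised above.
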

The thing with GPU hangs is that the driver is unable to continue its
operation and will be stuck one way or another.

The caller can't release the mutex, because it has already done an
operation (do_something() above) that must be done together with
util_queue_add_job and can't be separated from it. The atomicity of
command submission starts with the first mtx_lock call. Things are
irreversible after do_something(). The only two possible outcomes are
that util_queue_add_job either succeeds or waits and then succeeds.
There is no other option.

Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev