Andreas Rheinhardt:
> Ronald S. Bultje:
>> Fixes #11456.
>> ---
>>  libavcodec/threadprogress.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>> index 62c4fd898b..aa72ff80e7 100644
>> --- a/libavcodec/threadprogress.c
>> +++ b/libavcodec/threadprogress.c
>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>      if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>          return;
>>
>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>> -
>>      ff_mutex_lock(&pro->progress_mutex);
>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>      ff_cond_broadcast(&pro->progress_cond);
>>      ff_mutex_unlock(&pro->progress_mutex);
>>  }
>
> I don't really understand why this is supposed to fix a race; after all,
> the synchronisation of ff_thread_progress_(report|await) is not supposed
> to be provided by the mutex (which is avoided altogether in the fast
> path in ff_thread_progress_await()), but by storing and loading the
> progress variable.
> That's also the reason why I moved this outside of the mutex (compared
> to ff_thread_report_progress()). (This way it is possible for a consumer
> thread to see the new progress value earlier and possibly avoid the
> mutex altogether.)
>
Damn, this optimization works, but only if the progress variable is
always read with acquire semantics; it is currently read via
memory_order_relaxed inside the mutex (just like in
ff_thread_await_progress()). According to my understanding, this is what
happens:

1. Consumer thread waits for progress and finds that it is insufficient
   (fast path fails)
2. Producer thread updates the progress variable
3. Consumer thread acquires the mutex and reads the new progress via
   memory_order_relaxed, so it continues without having synchronised
   with the producer's release store
4. Producer thread acquires the mutex and broadcasts the new progress

I'd prefer to change these semantics so that we always perform
synchronisation via the atomic progress variable (unless you know of a
performance impact; I only know that on x86, both memory_order_relaxed
and memory_order_acquire are ordinary loads).

Thanks for looking into this.

- Andreas
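
For illustration, here is a minimal sketch of the "always synchronise via
the atomic" variant described above. It is not the FFmpeg code: the names
Progress, progress_report(), progress_await() and demo_run() are
hypothetical, and plain pthreads calls stand in for the ff_mutex_*/ff_cond_*
wrappers. The only point it tries to show is that the load inside the mutex
uses memory_order_acquire, so the store can stay outside the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for ThreadProgress; not FFmpeg's actual struct. */
typedef struct Progress {
    atomic_int      progress;
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
} Progress;

void progress_init(Progress *p)
{
    atomic_init(&p->progress, -1);
    pthread_mutex_init(&p->mutex, NULL);
    pthread_cond_init(&p->cond, NULL);
}

void progress_destroy(Progress *p)
{
    pthread_cond_destroy(&p->cond);
    pthread_mutex_destroy(&p->mutex);
}

/* Producer: publish the new value before taking the mutex, so that a
 * consumer may see it on its fast path and skip the mutex entirely. */
void progress_report(Progress *p, int n)
{
    if (atomic_load_explicit(&p->progress, memory_order_relaxed) >= n)
        return;

    atomic_store_explicit(&p->progress, n, memory_order_release);

    pthread_mutex_lock(&p->mutex);
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}

/* Consumer: every load that can end the wait is an acquire load, so it
 * pairs with the release store no matter which path observes it. */
void progress_await(Progress *p, int n)
{
    if (atomic_load_explicit(&p->progress, memory_order_acquire) >= n)
        return;                                   /* fast path */

    pthread_mutex_lock(&p->mutex);
    while (atomic_load_explicit(&p->progress, memory_order_acquire) < n)
        pthread_cond_wait(&p->cond, &p->mutex);
    pthread_mutex_unlock(&p->mutex);
}

/* Small demo: payload is a plain (non-atomic) int that the producer
 * writes before reporting progress. */
typedef struct Demo {
    Progress prog;
    int      payload;
    int      seen;
} Demo;

static void *consumer(void *arg)
{
    Demo *d = arg;
    progress_await(&d->prog, 1);
    d->seen = d->payload;   /* ordered after the producer's write */
    return NULL;
}

int demo_run(void)
{
    Demo d = { .payload = 0, .seen = -1 };
    pthread_t t;

    progress_init(&d.prog);
    int rc = pthread_create(&t, NULL, consumer, &d);
    assert(rc == 0);
    (void)rc;

    d.payload = 42;                  /* written before the release store */
    progress_report(&d.prog, 1);

    pthread_join(t, NULL);
    progress_destroy(&d.prog);
    return d.seen;
}
```

In this sketch a wakeup cannot be lost even though the store happens
outside the mutex: the consumer re-checks the value while holding the
mutex, and the producer can only broadcast once the consumer has either
seen the new value or is already blocked in pthread_cond_wait().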