Hi,

On Fri, Feb 7, 2025 at 10:50 PM Zhao Zhili
<quinkblack-at-foxmail....@ffmpeg.org> wrote:
> > On Feb 8, 2025, at 00:05, Ronald S. Bultje <rsbul...@gmail.com> wrote:
> >
> > Hi,
> >
> > On Fri, Feb 7, 2025 at 8:44 AM Zhao Zhili
> > <quinkblack-at-foxmail....@ffmpeg.org> wrote:
> >
> >>> On Feb 7, 2025, at 21:26, Ronald S. Bultje <rsbul...@gmail.com> wrote:
> >>> On Fri, Feb 7, 2025 at 6:22 AM Andreas Rheinhardt
> >>> <andreas.rheinha...@outlook.com> wrote:
> >>>> Ronald S. Bultje:
> >>>>> Fixes #11456.
> >>>>> ---
> >>>>>  libavcodec/threadprogress.c | 3 +--
> >>>>>  1 file changed, 1 insertion(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
> >>>>> index 62c4fd898b..aa72ff80e7 100644
> >>>>> --- a/libavcodec/threadprogress.c
> >>>>> +++ b/libavcodec/threadprogress.c
> >>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
> >>>>>      if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
> >>>>>          return;
> >>>>>
> >>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
> >>>>> -
> >>>>>      ff_mutex_lock(&pro->progress_mutex);
> >>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
> >>>>>      ff_cond_broadcast(&pro->progress_cond);
> >>>>>      ff_mutex_unlock(&pro->progress_mutex);
> >>>>>  }
> >>>>
> >>>> I don't really understand why this is supposed to fix a race; after
> >>>> all, the synchronisation of ff_thread_progress_(report|await) is not
> >>>> supposed to be provided by the mutex (which is avoided altogether in
> >>>> the fast path in ff_thread_progress_await()), but by storing and
> >>>> loading the progress variable.
> >>>> That's also the reason why I moved this outside of the mutex
> >>>> (compared to ff_thread_report_progress()). (This way it is possible
> >>>> for a consumer thread to see the new progress value earlier and
> >>>> possibly avoid the mutex altogether.)
> >>>
> >>> The consumer thread already checks the value without the lock, so the
> >>> significance of that last point seems minor to me. This would be
> >>> different if the wait() counterpart had no lockless path. Or am I
> >>> missing something?
> >>
> >> What Andreas says is that an atomic_store before mutex_lock makes the
> >> first atomic_load in progress_wait more likely to succeed. The earlier
> >> progress is set, the higher the chance that progress_wait takes the
> >> fast path.
> >
> > I understand that is true in theory - but I have doubts on whether this
> > is in any way significant in practice if wait() already has a lockless
> > fast path.
> >
> > I measured this in the most unscientific way possible by decoding
> > gizmo.webm (from Firefox's bug report) 10x before and after my patch,
> > taking the average and standard deviation, and comparing these with
> > each other. I repeated this a couple of times. The values (before vs.
> > after avg +/- stddev) are obviously never exactly the same, but they
> > swarm around each other like a random noise generator. Or to say it
> > differently: in my highly unscientific test, I see no performance
> > difference.
> >
> > So ... Is this really worth it?
>
> I did another test by measuring fast_path / (fast_path + slow_path) on
> macOS of HEVC decoding with 10 threads.
>
> 1. Before the patch, it’s 99.741%.
> 2. With the patch, it’s 99.743%.
> 3. With while (atomic_load_explicit(&pro->progress, memory_order_acquire)
>    < n), it’s 99.741%.
>
> So, it doesn’t matter for performance. The current patch is the most
> elegant solution in my opinion.

Thanks for testing.
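To make the failure mode concrete, here is a minimal standalone sketch of
the report/await pattern being discussed. This is not the actual FFmpeg
code: Progress, progress_init, progress_report and progress_await are
hypothetical stand-ins for ThreadProgress and its functions, and plain
pthreads replace the ff_mutex/ff_cond wrappers.

#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for ThreadProgress; field names mirror the
 * diff above, not copied from the real header. */
typedef struct Progress {
    atomic_int      progress;
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
} Progress;

void progress_init(Progress *p)
{
    atomic_init(&p->progress, -1);
    pthread_mutex_init(&p->mutex, NULL);
    pthread_cond_init(&p->cond, NULL);
}

void progress_report(Progress *p, int n)
{
    if (atomic_load_explicit(&p->progress, memory_order_relaxed) >= n)
        return;

    pthread_mutex_lock(&p->mutex);
    /* As in the patch, the store happens while the mutex is held: a
     * waiter in the slow path either sees the new value in its
     * re-check, or is already parked in pthread_cond_wait() when the
     * broadcast fires, so the wakeup cannot be lost. */
    atomic_store_explicit(&p->progress, n, memory_order_release);
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}

void progress_await(Progress *p, int n)
{
    /* Lockless fast path: the ~99.74% case in Zhao's measurement. */
    if (atomic_load_explicit(&p->progress, memory_order_acquire) >= n)
        return;

    /* Slow path: re-check under the mutex, then sleep. The relaxed
     * load suffices here because the mutex itself orders it against
     * the store in progress_report(). */
    pthread_mutex_lock(&p->mutex);
    while (atomic_load_explicit(&p->progress, memory_order_relaxed) < n)
        pthread_cond_wait(&p->cond, &p->mutex);
    pthread_mutex_unlock(&p->mutex);
}

The design point is the textbook condition-variable rule: the predicate
must be modified while holding the mutex that guards the sleep. The
waiter's fast path is unaffected by where the store sits, which matches
the measurements above showing no performance change.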
Andreas, any further thoughts?

Ronald