> On Feb 7, 2025, at 19:46, Zhao Zhili <quinkblack-at-foxmail....@ffmpeg.org> wrote:
>
>> On Feb 7, 2025, at 19:39, Andreas Rheinhardt <andreas.rheinha...@outlook.com> wrote:
>>
>> Andreas Rheinhardt:
>>> Ronald S. Bultje:
>>>> Fixes #11456.
>>>> ---
>>>>  libavcodec/threadprogress.c | 3 +--
>>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>>
>>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>>> index 62c4fd898b..aa72ff80e7 100644
>>>> --- a/libavcodec/threadprogress.c
>>>> +++ b/libavcodec/threadprogress.c
>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>>>      if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>>>          return;
>>>>
>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>> -
>>>>      ff_mutex_lock(&pro->progress_mutex);
>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>      ff_cond_broadcast(&pro->progress_cond);
>>>>      ff_mutex_unlock(&pro->progress_mutex);
>>>>  }
>>>
>>> I don't really understand why this is supposed to fix a race; after all,
>>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>>> to be provided by the mutex (which is avoided altogether in the fast
>>> path in ff_thread_progress_await()), but by storing and loading the
>>> progress variable.
>>> That's also the reason why I moved this outside of the mutex (compared
>>> to ff_thread_report_progress()). (This way it is possible for a consumer
>>> thread to see the new progress value earlier and possibly avoid the
>>> mutex altogether.)
>>>
>>
>> Damn, this optimization works, but only if the progress variable is
>> always read with acquire semantics; it is currently read via
>> memory_order_relaxed inside the mutex (just like in
>> ff_thread_await_progress()).
>>
>> According to my understanding, this is what happens:
>> 1. Consumer thread waits for progress and finds that it is insufficient
>>    (fast path fails)
>> 2. Producer thread updates progress variable
>> 3. Consumer thread acquires the mutex and reads new progress via
>>    memory_order_relaxed
>> 4. Producer thread acquires mutex and broadcasts the new progress
>>
>> I'd prefer to change these semantics so that we always perform
>> synchronisation via the atomic progress variable (unless you know of a
>> performance impact -- I only know that on x86, both memory_order_relaxed
>> and memory_order_acquire are ordinary loads).
>
> I have considered that solution too: always using memory_order_acquire
> when waiting for progress. memory_order_relaxed is a normal load on ARM,
> while memory_order_acquire isn't, so there is a real difference.
>
> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64-
>
> Now it's weird to use memory_order_acquire inside a mutex lock.
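
For anyone following along, here is a minimal sketch of the "always acquire" alternative discussed above. It is not FFmpeg's code: it uses plain pthreads instead of the ff_mutex_*/ff_cond_* wrappers, and Progress, progress_report(), progress_await(), payload and run_demo() are hypothetical names made up for illustration. The release store stays outside the mutex, and the load in the wait loop is an acquire load, so whichever load observes the new value also synchronises with the producer's writes.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for ThreadProgress, built on plain pthreads. */
typedef struct Progress {
    atomic_int      progress;
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
} Progress;

static int payload; /* ordinary (non-atomic) data published by the producer */

static void progress_init(Progress *p)
{
    atomic_init(&p->progress, -1);
    pthread_mutex_init(&p->mutex, NULL);
    pthread_cond_init(&p->cond, NULL);
}

static void progress_report(Progress *p, int n)
{
    if (atomic_load_explicit(&p->progress, memory_order_relaxed) >= n)
        return;

    /* Release store outside the mutex: a waiter may see it on its fast path. */
    atomic_store_explicit(&p->progress, n, memory_order_release);

    /* Broadcasting under the mutex means a waiter that has already checked
     * the counter cannot miss the wakeup. */
    pthread_mutex_lock(&p->mutex);
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}

static void progress_await(Progress *p, int n)
{
    /* Fast path: the acquire load pairs with the release store above. */
    if (atomic_load_explicit(&p->progress, memory_order_acquire) >= n)
        return;

    pthread_mutex_lock(&p->mutex);
    /* Acquire here as well: the store may become visible before the producer
     * has taken the mutex, so the mutex alone does not order this read. */
    while (atomic_load_explicit(&p->progress, memory_order_acquire) < n)
        pthread_cond_wait(&p->cond, &p->mutex);
    pthread_mutex_unlock(&p->mutex);
}

static void *producer(void *arg)
{
    Progress *p = arg;
    payload = 42;          /* write the data first ... */
    progress_report(p, 1); /* ... then publish the progress */
    return NULL;
}

/* Starts one producer, waits for progress 1, returns the payload it saw. */
int run_demo(void)
{
    Progress p;
    pthread_t th;
    int seen;

    progress_init(&p);
    if (pthread_create(&th, NULL, producer, &p))
        return -1;
    progress_await(&p, 1);
    seen = payload;
    pthread_join(th, NULL);
    pthread_cond_destroy(&p.cond);
    pthread_mutex_destroy(&p.mutex);
    return seen;
}
```

The patch quoted above takes the other route: it moves the store under the mutex, so the relaxed load in the wait loop is ordered by the lock itself, at the cost of the waiter possibly seeing the new value later. Which one is cheaper on ARM is exactly the open question here.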
cc Remi, who wrote VLC's atomic_wait and mutex from scratch.

>>
>> Thanks for looking into this.
>>
>> - Andreas
>
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".