On 7 February 2025 13:53:22 GMT+02:00, Zhao Zhili <quinkbl...@foxmail.com> 
wrote:
>
>
>> On Feb 7, 2025, at 19:46, Zhao Zhili <quinkblack-at-foxmail....@ffmpeg.org> 
>> wrote:
>> 
>> 
>> 
>>> On Feb 7, 2025, at 19:39, Andreas Rheinhardt 
>>> <andreas.rheinha...@outlook.com> wrote:
>>> 
>>> Andreas Rheinhardt:
>>>> Ronald S. Bultje:
>>>>> Fixes #11456.
>>>>> ---
>>>>> libavcodec/threadprogress.c | 3 +--
>>>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>>> 
>>>>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>>>>> index 62c4fd898b..aa72ff80e7 100644
>>>>> --- a/libavcodec/threadprogress.c
>>>>> +++ b/libavcodec/threadprogress.c
>>>>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>>>>    if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>>>>        return;
>>>>> 
>>>>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>> -
>>>>>    ff_mutex_lock(&pro->progress_mutex);
>>>>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>>>>    ff_cond_broadcast(&pro->progress_cond);
>>>>>    ff_mutex_unlock(&pro->progress_mutex);
>>>>> }
>>>> 
>>>> I don't really understand why this is supposed to fix a race; after all,
>>>> the synchronisation of ff_thread_progress_(report|await) is not supposed
>>>> to be provided by the mutex (which is avoided altogether in the fast
>>>> path in ff_thread_progress_await()), but by storing and loading the
>>>> progress variable.
>>>> That's also the reason why I moved this outside of the mutex (compared
>>>> to ff_thread_report_progress()). (This way it is possible for a consumer
>>>> thread to see the new progress value earlier and possibly avoid the
>>>> mutex altogether.)
>>>> 
>>> 
>>> Damn, this optimization works, but only if the progress variable is
>>> always read with acquire-semantics; it is currently read via
>>> memory_order_relaxed inside the mutex (just like in
>>> ff_thread_await_progress()).
>>> 
>>> According to my understanding, this is what happens:
>>> 1. Consumer thread waits for progress and finds that it is insufficient
>>>    (fast path fails)
>>> 2. Producer thread updates progress variable
>>> 3. Consumer thread acquires the mutex and reads new progress via
>>>    memory_order_relaxed
>>> 4. Producer thread acquires mutex and broadcasts the new progress
>>> 
>>> I'd prefer to change these semantics so that we always perform
>>> synchronisation via the atomic progress variable (unless you know of a
>>> performance impact -- I only know that on x86, both memory_order_relaxed
>>> and memory_order_acquire are ordinary loads).
>> 
>> I have considered that solution too: always using memory_order_acquire
>> when waiting for progress. On ARM, memory_order_relaxed is an ordinary
>> load while memory_order_acquire is not, so there is a real difference.
>> 
>> https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions/LDAPR--A64-
>> 
>> That said, it is odd to use memory_order_acquire while holding the mutex.
>
>cc Remi, who has written VLC's atomic_wait and mutex from scratch.

It always gets weird when you mix atomics and CVs. CVs must nominally have some 
sort of associated state that is modified under the same lock. But in the 
preexisting code there is no such thing.
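
To make that concrete, here is a minimal sketch of the conventional pattern, where the state the CV stands for only changes while the lock is held. The names (CvProgress, cv_report, cv_await, cv_run_demo) are made up for illustration and are not the FFmpeg ones; it assumes POSIX threads, C11 atomics and a single producer per progress object.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical progress tracker; not FFmpeg's actual ThreadProgress. */
typedef struct CvProgress {
    atomic_int      value;
    pthread_mutex_t lock;
    pthread_cond_t  cond;
} CvProgress;

static void cv_init(CvProgress *p)
{
    atomic_init(&p->value, 0);
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->cond, NULL);
}

/* Assumes a single producer, so the early-out check cannot race
 * with another writer. */
static void cv_report(CvProgress *p, int n)
{
    if (atomic_load_explicit(&p->value, memory_order_relaxed) >= n)
        return;

    pthread_mutex_lock(&p->lock);
    /* The state changes under the same lock the waiter checks it under. */
    atomic_store_explicit(&p->value, n, memory_order_release);
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->lock);
}

static void cv_await(CvProgress *p, int n)
{
    /* Lock-free fast path: the acquire load pairs with the release store. */
    if (atomic_load_explicit(&p->value, memory_order_acquire) >= n)
        return;

    pthread_mutex_lock(&p->lock);
    /* Every store happens under the lock, so the mutex already orders
     * this load; relaxed is enough here. */
    while (atomic_load_explicit(&p->value, memory_order_relaxed) < n)
        pthread_cond_wait(&p->cond, &p->lock);
    pthread_mutex_unlock(&p->lock);
}

static CvProgress cv_demo_progress;
static int        cv_demo_payload;

static void *cv_demo_producer(void *arg)
{
    (void)arg;
    cv_demo_payload = 42;            /* plain write, published by the report */
    cv_report(&cv_demo_progress, 1);
    return NULL;
}

/* Returns the payload as seen by the consumer, or -1 on thread failure. */
static int cv_run_demo(void)
{
    pthread_t t;
    int seen;

    cv_init(&cv_demo_progress);
    if (pthread_create(&t, NULL, cv_demo_producer, NULL))
        return -1;
    cv_await(&cv_demo_progress, 1);
    seen = cv_demo_payload;
    pthread_join(t, NULL);
    return seen;
}
```

With the store inside the critical section, the relaxed load under the mutex is ordered by the lock itself, and the atomic only matters for the fast path.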

So yeah, if you need the performance properties (or the infallible 
initialisation, or the implicit clean-up) of atomics, then you really should 
use futexes, not CVs. Otherwise don't use atomics.
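
For comparison, a rough sketch of the futex flavour, Linux only. It calls the raw futex syscall through syscall(2) and assumes that _Atomic uint32_t has the same representation as a plain 32-bit word, which holds on the usual Linux toolchains but is not guaranteed by C11. All fx_* names are hypothetical, and it again assumes a single producer.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <limits.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The atomic word is the only shared state: no mutex, no CV. */
static _Atomic uint32_t fx_progress;
static int              fx_payload;

/* Thin wrapper around the raw syscall (glibc has no futex() function). */
static long fx_call(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Assumes a single producer. */
static void fx_report(uint32_t n)
{
    if (atomic_load_explicit(&fx_progress, memory_order_relaxed) >= n)
        return;
    atomic_store_explicit(&fx_progress, n, memory_order_release);
    /* Wake every thread currently sleeping on the word. */
    fx_call(&fx_progress, FUTEX_WAKE_PRIVATE, INT_MAX);
}

static void fx_await(uint32_t n)
{
    uint32_t cur;

    while ((cur = atomic_load_explicit(&fx_progress,
                                       memory_order_acquire)) < n) {
        /* The kernel sleeps only if the word still equals cur; if the
         * producer stored in between, the call returns immediately and
         * the loop re-checks. Spurious wake-ups are handled the same way. */
        fx_call(&fx_progress, FUTEX_WAIT_PRIVATE, cur);
    }
}

static void *fx_demo_producer(void *arg)
{
    (void)arg;
    fx_payload = 42;                 /* plain write, published by the report */
    fx_report(1);
    return NULL;
}

/* Returns the payload as seen by the consumer, or -1 on thread failure. */
static int fx_run_demo(void)
{
    pthread_t t;
    int seen;

    if (pthread_create(&t, NULL, fx_demo_producer, NULL))
        return -1;
    fx_await(1);
    seen = fx_payload;
    pthread_join(t, NULL);
    return seen;
}
```

Here every read of the word is an acquire load and the kernel does the compare-and-sleep atomically, so there is no window between checking the value and going to sleep.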

In this particular case, whether this is a false positive of TSan or a real bug 
depends on the behaviour of other code paths (which I have not had time to 
review as of yet).
