Andreas Rheinhardt:
> Ronald S. Bultje:
>> Fixes #11456.
>> ---
>>  libavcodec/threadprogress.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/threadprogress.c b/libavcodec/threadprogress.c
>> index 62c4fd898b..aa72ff80e7 100644
>> --- a/libavcodec/threadprogress.c
>> +++ b/libavcodec/threadprogress.c
>> @@ -55,9 +55,8 @@ void ff_thread_progress_report(ThreadProgress *pro, int n)
>>      if (atomic_load_explicit(&pro->progress, memory_order_relaxed) >= n)
>>          return;
>>  
>> -    atomic_store_explicit(&pro->progress, n, memory_order_release);
>> -
>>      ff_mutex_lock(&pro->progress_mutex);
>> +    atomic_store_explicit(&pro->progress, n, memory_order_release);
>>      ff_cond_broadcast(&pro->progress_cond);
>>      ff_mutex_unlock(&pro->progress_mutex);
>>  }
> 
> I don't really understand why this is supposed to fix a race; after all,
> the synchronisation of ff_thread_progress_(report|await) is not supposed
> to be provided by the mutex (which is avoided altogether in the fast
> path in ff_thread_progress_await()), but by storing and loading the
> progress variable.
> That's also the reason why I moved this outside of the mutex (compared
> to ff_thread_report_progress()). This way it is possible for a consumer
> thread to see the new progress value earlier and possibly avoid the
> mutex altogether.
> 

Damn, this optimization only works if the progress variable is always
read with acquire semantics; it is currently read via
memory_order_relaxed inside the mutex (just like in
ff_thread_await_progress()).

According to my understanding, this is what happens:
1. Consumer thread waits for progress and finds that it is insufficient
   (fast path fails).
2. Producer thread updates the progress variable.
3. Consumer thread acquires the mutex and reads the new progress via
   memory_order_relaxed, so it proceeds without having synchronised with
   the producer's release store.
4. Producer thread acquires the mutex and broadcasts the new progress.

I'd prefer to change these semantics so that we always perform
synchronisation via the atomic progress variable (unless you know of a
performance impact -- I only know that on x86, both memory_order_relaxed
and memory_order_acquire are ordinary loads).
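
To illustrate what I mean, here is a minimal sketch using plain pthreads
and C11 atomics instead of the ff_mutex_*/ff_cond_* wrappers. The
Progress struct, the progress_* names and the initial value of -1 are
hypothetical stand-ins chosen for this example, not the actual
threadprogress.c code; the point is only that both loads in the await
function use memory_order_acquire:

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for ThreadProgress; not the real FFmpeg struct. */
typedef struct Progress {
    atomic_int      progress;
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
} Progress;

static void progress_init(Progress *p)
{
    atomic_init(&p->progress, -1);   /* assumed "nothing reported yet" value */
    pthread_mutex_init(&p->mutex, NULL);
    pthread_cond_init(&p->cond, NULL);
}

static void progress_report(Progress *p, int n)
{
    /* Only the producer writes progress, so relaxed suffices for this check. */
    if (atomic_load_explicit(&p->progress, memory_order_relaxed) >= n)
        return;

    /* Release store outside the mutex, so consumers can see it early. */
    atomic_store_explicit(&p->progress, n, memory_order_release);

    /* The mutex is only there to wake up consumers that are already waiting. */
    pthread_mutex_lock(&p->mutex);
    pthread_cond_broadcast(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}

static void progress_await(Progress *p, int n)
{
    /* Fast path: the acquire load pairs with the release store above. */
    if (atomic_load_explicit(&p->progress, memory_order_acquire) >= n)
        return;

    pthread_mutex_lock(&p->mutex);
    /* Slow path: acquire as well, because the store may become visible
     * here before the producer has taken the mutex. */
    while (atomic_load_explicit(&p->progress, memory_order_acquire) < n)
        pthread_cond_wait(&p->cond, &p->mutex);
    pthread_mutex_unlock(&p->mutex);
}
```

With this, the synchronisation never depends on which thread takes the
mutex first; a consumer that observes the new value, on either path, has
synchronised with the producer's store.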

Thanks for looking into this.

- Andreas
