On 6/6/2021 8:39 AM, Andreas Rheinhardt wrote:
James Almer:
On 6/5/2021 11:27 AM, Anton Khirnov wrote:
Memory ordering constraints other than the default (sequentially
consistent) can behave in very unintuitive and unexpected ways, and so
should be avoided unless there is a strong (typically performance)
reason for using them. This holds especially for memory_order_relaxed,
which imposes no ordering constraints.
Performance is important for FIFO, though. Could they use a laxer order
at least for the fetch_sub and fetch_add cases?
Also, commit fed50c4304e, which made cpu_flags an atomic type, states
that ThreadSanitizer stopped reporting races on tests/cpu_init after
that change, so maybe memory order is not important for it, only atomicity.
This is true: There can by definition be no data race involving an
atomic variable that is exclusively accessed via atomic operations: "The
execution of a program contains a data race if it contains two
conflicting actions in different threads, at least one of which is *not
atomic*, and neither happens before the other." (C11, 5.1.2.4.25)
Therefore fed50c4304eecb352e29ce789cdb96ea84d6162f and
a2a38b160620d91bc3f895dadc4501c589998b9c fixed data races (the latter
also a race condition); using sequentially consistent ordering meanwhile
brings no gain: It may provide synchronization between two threads
calling e.g. av_get_cpu_flags(), but the only effect of this is that
tsan might overlook buggy code that actually contains races. It notably
does not really establish an ordering: If two threads call
av_cpu_count() concurrently, then it is indetermined which of the two
threads will call av_log (if it has not already been called), but with
this patch it is certain that the second thread synchronizes with the
first thread and sees everything that the first thread has done so far
(which can mask bugs); and if we look at the test in
libavutil/tests/cpu_init.c, then all the possibilities that existed with
memory_order_relaxed also exist with memory_order_seq_cst:
- Thread 1 can read first, see that cpu_flags is uninitialized and
initialize it, followed by thread 2 seeing and reusing the already
initialized value.
- Thread 1 can read first, followed by thread 2 reading the value (with
both threads seeing that cpu_flags is uninitialized), followed by thread
1 storing its value, followed by thread 2 overwriting the value again;
the order of writes can also be the other way around.
- Same as the last one, but with the order of stores reversed.
- Exchanging the roles of thread 1 and thread 2 in the above scenarios
leads to the remaining cases.
In any case these cpu and mem functions are hardly a bottleneck
anywhere, and rarely called outside of debugging, so this change is ok.
Mem functions are rarely called outside of debugging?
I was specifically thinking about about av_max_alloc(), but overlooked
the fact max_alloc_size is read by av_malloc() an co, which are
obviously not for debug only. Sorry.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".