On 2023-03-22 04:02, Ondřej Surý via lttng-dev wrote:
> Hi,
>
> this happens with all the patches fully applied and doesn't seem to be
> caused by anything I am doing :)
> WARNING: ThreadSanitizer: data race (pid=3995296)
>   Write of size 8 at 0x7fb51e5fd048 by thread T296:
>     #0 __tsan_memset <null> (badcache_test+0x49257d) (BuildId: 166dea93b2dca28264fc85c79b56d116cd491ed7)
>     #1 urcu_mb_synchronize_rcu /home/ondrej/Projects/userspace-rcu/src/urcu.c:412:2 (liburcu-mb.so.8+0x35e0) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #2 call_rcu_thread /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:381:4 (liburcu-mb.so.8+0x9c38) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>
>   Previous atomic write of size 4 at 0x7fb51e5fd048 by thread T295:
>     #0 urcu_adaptative_wake_up /home/ondrej/Projects/userspace-rcu/src/../src/urcu-wait.h:138:2 (liburcu-mb.so.8+0x8cb9) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #1 urcu_wake_all_waiters /home/ondrej/Projects/userspace-rcu/src/../src/urcu-wait.h:214:3 (liburcu-mb.so.8+0x41de) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #2 urcu_mb_synchronize_rcu /home/ondrej/Projects/userspace-rcu/src/urcu.c:522:2 (liburcu-mb.so.8+0x3766) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #3 call_rcu_thread /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:381:4 (liburcu-mb.so.8+0x9c38) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>
>   Location is stack of thread T296.
>
>   Thread T296 (tid=3995606, running) created by thread T272 at:
>     #0 pthread_create <null> (badcache_test+0x44d5fb) (BuildId: 166dea93b2dca28264fc85c79b56d116cd491ed7)
>     #1 call_rcu_data_init /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:460:8 (liburcu-mb.so.8+0x5b26) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #2 __create_call_rcu_data /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:514:2 (liburcu-mb.so.8+0x53b5) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #3 urcu_mb_create_call_rcu_data /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:524:9 (liburcu-mb.so.8+0x52bd) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #4 loop_run /home/ondrej/Projects/bind9/lib/isc/loop.c:293:31 (libisc-9.19.12-dev.so+0x7a0a0) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
>     #5 loop_thread /home/ondrej/Projects/bind9/lib/isc/loop.c:327:2 (libisc-9.19.12-dev.so+0x77890) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
>     #6 isc__trampoline_run /home/ondrej/Projects/bind9/lib/isc/trampoline.c:202:11 (libisc-9.19.12-dev.so+0xaa6be) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
>
>   Thread T295 (tid=3995605, running) created by thread T261 at:
>     #0 pthread_create <null> (badcache_test+0x44d5fb) (BuildId: 166dea93b2dca28264fc85c79b56d116cd491ed7)
>     #1 call_rcu_data_init /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:460:8 (liburcu-mb.so.8+0x5b26) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #2 __create_call_rcu_data /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:514:2 (liburcu-mb.so.8+0x53b5) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #3 urcu_mb_create_call_rcu_data /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:524:9 (liburcu-mb.so.8+0x52bd) (BuildId: c613f5290cb1c2233fc80714aec4bd742c418823)
>     #4 loop_run /home/ondrej/Projects/bind9/lib/isc/loop.c:293:31 (libisc-9.19.12-dev.so+0x7a0a0) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
>     #5 loop_thread /home/ondrej/Projects/bind9/lib/isc/loop.c:327:2 (libisc-9.19.12-dev.so+0x77890) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
>     #6 isc__trampoline_run /home/ondrej/Projects/bind9/lib/isc/trampoline.c:202:11 (libisc-9.19.12-dev.so+0xaa6be) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
>
> SUMMARY: ThreadSanitizer: data race (/home/ondrej/Projects/bind9/tests/dns/.libs/badcache_test+0x49257d) (BuildId: 166dea93b2dca28264fc85c79b56d116cd491ed7) in __tsan_memset
> This is between:
>
> - DEFINE_URCU_WAIT_NODE(wait, URCU_WAIT_WAITING);
>
> and
>
> - uatomic_or(&wait->state, URCU_WAIT_TEARDOWN);
> That's pretty weird, because the "Write" happens on a stack-local
> variable,
Yes, it's the initialization of this "wait state" variable, which is located on
the stack of the thread being blocked.
At initialization time, there are no concurrent accesses to this variable.
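
To make that concrete, here is a condensed sketch of the waiter-side pattern (paraphrasing src/urcu-wait.h and src/urcu.c, not the verbatim source): the node is fully initialized while still private to the waiter's stack, and only published on the shared queue afterwards.

#include <stdint.h>

enum urcu_wait_state {
	URCU_WAIT_WAITING  = 0,		/* compared directly */
	URCU_WAIT_WAKEUP   = (1 << 0),	/* remaining values are bits */
	URCU_WAIT_RUNNING  = (1 << 1),
	URCU_WAIT_TEARDOWN = (1 << 2),
};

struct urcu_wait_node {
	int32_t state;	/* enum urcu_wait_state */
	/* wait-queue linkage omitted in this sketch */
};

#define DEFINE_URCU_WAIT_NODE(name, _state) \
	struct urcu_wait_node name = { .state = (_state) }

void synchronize_sketch(void)
{
	/*
	 * Stack-local and not yet published: no other thread can reach
	 * the node, so this initialization (the memset in the TSan
	 * report) has no concurrent accessors.
	 */
	DEFINE_URCU_WAIT_NODE(wait, URCU_WAIT_WAITING);

	/*
	 * Only after the node is pushed onto the shared waiter queue
	 * can the waker touch wait.state.  The waiter then blocks in
	 * urcu_adaptative_busy_wait(&wait) until it observes
	 * URCU_WAIT_TEARDOWN, after which the stack slot may be reused;
	 * a later synchronize_rcu() re-initializing the same address is
	 * the access TSan pairs with the waker's old uatomic_or().
	 */
	(void)wait;
}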
while the "Previous write" happens after futex,
That "previous" write was clearly for an unrelated prior blocking of the same thread.
Basically what is missing here is information about the lifetime of this urcu_wait_node object.
After urcu_adaptative_busy_wait() has observed the state (uatomic_read(&wait->state) &
URCU_WAIT_TEARDOWN), it knows that the object has no concurrent user anymore, which means the thread
can move forward and reclaim this area of the stack for other uses.
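
That end-of-lifetime handshake looks roughly like this (a paraphrase using GCC __atomic builtins in place of liburcu's uatomic_* macros; the real code is urcu_adaptative_wake_up() and the tail of urcu_adaptative_busy_wait() in src/urcu-wait.h):

#include <stdint.h>

#define URCU_WAIT_TEARDOWN (1 << 2)

/*
 * Waker side, end of urcu_adaptative_wake_up(): after setting
 * URCU_WAIT_TEARDOWN the waker must never touch *state again.
 */
static void waker_teardown(int32_t *state)
{
	/* ...URCU_WAIT_WAKEUP was set and the futex woken before this... */
	__atomic_fetch_or(state, URCU_WAIT_TEARDOWN, __ATOMIC_SEQ_CST);
}

/*
 * Waiter side, tail of urcu_adaptative_busy_wait(): once TEARDOWN is
 * observed, no concurrent user remains and the stack slot is free.
 */
static void waiter_teardown(int32_t *state)
{
	while (!(__atomic_load_n(state, __ATOMIC_SEQ_CST)
			& URCU_WAIT_TEARDOWN))
		;	/* the real code spins, then falls back to poll() */
	/* the wait node's lifetime ends here */
}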
So somehow we should add an annotation about the lifetime of this object, which begins with DEFINE_URCU_WAIT_NODE() and ends right after "urcu_posix_assert(uatomic_read(&wait->state) & URCU_WAIT_TEARDOWN);".
Thanks,
Mathieu
> which led me to the fact that ThreadSanitizer doesn't intercept futex,
> but we can annotate the futexes:
>
> https://groups.google.com/g/thread-sanitizer/c/T0G_NyyZ3s4
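
The gist of that thread: since TSan cannot see through syscall(SYS_futex), wrap the futex calls and add the happens-before edge by hand. A minimal Linux-only sketch (hypothetical wrapper names; liburcu's real entry point is futex_noasync(), and __tsan_acquire()/__tsan_release() are declared or stubbed as in the sketch above):

#include <stdint.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* As in the previous sketch; stub these out without -fsanitize=thread. */
void __tsan_acquire(void *addr);
void __tsan_release(void *addr);

/* Wake: everything written before the wake happens-before... */
static long annotated_futex_wake(int32_t *uaddr, int nr_wake)
{
	__tsan_release(uaddr);
	return syscall(SYS_futex, uaddr, FUTEX_WAKE, nr_wake,
			NULL, NULL, 0);
}

/* ...the matching wait returning in the woken thread. */
static long annotated_futex_wait(int32_t *uaddr, int32_t expected)
{
	long ret = syscall(SYS_futex, uaddr, FUTEX_WAIT, expected,
			NULL, NULL, 0);
	__tsan_acquire(uaddr);
	return ret;
}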
> Oh boy...
>
> Ondrej
> --
> Ondřej Surý (He/Him)
> ond...@sury.org
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev