On 2023-03-22 07:01, Ondřej Surý via lttng-dev wrote:
On 22. 3. 2023, at 9:02, Ondřej Surý via lttng-dev <lttng-dev@lists.lttng.org> wrote:
That's rather odd, because the "Write" happens on a stack-local variable, while the "Previous write" happens after a futex. That led me to the fact that ThreadSanitizer doesn't intercept futex calls, but the futexes can be annotated:
https://groups.google.com/g/thread-sanitizer/c/T0G_NyyZ3s4
FTR, neither annotating the futex with __tsan_acquire(addr) and __tsan_release(addr) nor falling back to compat_futex_async() for ThreadSanitizer has helped.
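For context, the kind of annotation I mean looks roughly like this. It is a stand-alone sketch, not the actual liburcu futex wrapper; the helper names are made up, and it only makes sense when built with -fsanitize=thread:

#include <stdint.h>
#include <unistd.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <sanitizer/tsan_interface.h>

/* Thin wrapper around the futex(2) syscall (Linux only). */
static long
sys_futex(int32_t *uaddr, int op, int32_t val) {
	return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Waker side: release before waking so TSAN publishes our prior writes. */
static void
annotated_futex_wake(int32_t *uaddr) {
	__tsan_release(uaddr);
	sys_futex(uaddr, FUTEX_WAKE, 1);
}

/* Waiter side: acquire after returning so TSAN pairs it with the release. */
static void
annotated_futex_wait(int32_t *uaddr, int32_t expected) {
	sys_futex(uaddr, FUTEX_WAIT, expected);
	__tsan_acquire(uaddr);
}

The __tsan_release()/__tsan_acquire() pair on uaddr is what should create the happens-before edge TSAN otherwise misses across the futex.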
It seems to me that TSAN still doesn't understand the synchronization between RCU read-side critical sections and call_rcu()/synchronize_rcu(), as I am also getting the following reports:
  Write of size 8 at 0x7b54000009c0 by thread T102:
    #0 __tsan_memset <null> (badcache_test+0x49257d) (BuildId: a7c1595d61e3ee411276cf89a536a8daefa959a3)
    #1 mem_put /home/ondrej/Projects/bind9/lib/isc/mem.c:324:3 (libisc-9.19.12-dev.so+0x7d136) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
    #2 isc__mem_put /home/ondrej/Projects/bind9/lib/isc/mem.c:684:2 (libisc-9.19.12-dev.so+0x7e0c3) (BuildId: a33cd26e483b73684928b4782627f1278c001605)
    #3 bcentry_destroy_rcu /home/ondrej/Projects/bind9/lib/dns/badcache.c:163:2 (libdns-9.19.12-dev.so+0x4e071) (BuildId: 8a550b795003cd1075ff29590734c806d84e76e6)
    #4 call_rcu_thread /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:389:5 (liburcu-mb.so.8+0x9d6b) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)

  Previous atomic write of size 8 at 0x7b54000009c0 by main thread (mutexes: write M0):
    #0 ___cds_wfcq_append /home/ondrej/Projects/userspace-rcu/src/../include/urcu/static/wfcqueue.h:202:2 (liburcu-mb.so.8+0xa8ae) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
    #1 _cds_wfcq_enqueue /home/ondrej/Projects/userspace-rcu/src/../include/urcu/static/wfcqueue.h:223:9 (liburcu-mb.so.8+0xac09) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
    #2 _call_rcu /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:719:2 (liburcu-mb.so.8+0x604f) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
    #3 urcu_mb_barrier /home/ondrej/Projects/userspace-rcu/src/../src/urcu-call-rcu-impl.h:932:3 (liburcu-mb.so.8+0x4d1b) (BuildId: d4f5ea9d96625c7b7d2b2efb590b208f7b83cb6f)
    #4 badcache_flush /home/ondrej/Projects/bind9/lib/dns/badcache.c:329:2 (libdns-9.19.12-dev.so+0x4d8b3) (BuildId: 8a550b795003cd1075ff29590734c806d84e76e6)

  [...]
E.g. ThreadSanitizer reports a race between the place where bcentry->rcu_head is added to the call_rcu() queue and the point where the call_rcu callbacks are invoked. Annotating the bcentry with acquire/release here helps with this particular data race, but it does not feel right to me to add annotations at this level.

The code is not very complicated there:
static void
bcentry_destroy_rcu(struct rcu_head *rcu_head) {
	dns_bcentry_t *bad = caa_container_of(rcu_head, dns_bcentry_t, rcu_head);

	/* __tsan_release(bad); <-- this helps */

	dns_badcache_t *bc = bad->bc;
	isc_mem_put(bc->mctx, bad, sizeof(*bad));
	dns_badcache_detach(&bc);
}

static void
bcentry_evict(struct cds_lfht *ht, dns_bcentry_t *bad) {
	/* There can be multiple deleters now */
	if (cds_lfht_del(ht, &bad->ht_node) == 0) {
		/* __tsan_acquire(bad); <-- this helps */
		call_rcu(&bad->rcu_head, bcentry_destroy_rcu);
	}
}
static void
badcache_flush(dns_badcache_t *bc, struct cds_lfht *ht) {
	struct cds_lfht *oldht = rcu_xchg_pointer(&bc->ht, ht);
	synchronize_rcu();

	rcu_read_lock();
	dns_bcentry_t *bad = NULL;
	struct cds_lfht_iter iter;
	cds_lfht_for_each_entry (oldht, &iter, bad, ht_node) {
		bcentry_evict(oldht, bad);
	}
	rcu_read_unlock();
	rcu_barrier();

	RUNTIME_CHECK(cds_lfht_destroy(oldht, NULL) == 0);
}
Any ideas?
I suspect what happens here is that TSAN is considering the

	/*
	 * Implicit memory barrier after uatomic_xchg() orders store to
	 * q->tail before store to old_tail->next.
	 *
	 * At this point, dequeuers see a NULL tail->p->next, which
	 * indicates that the queue is being appended to. The following
	 * store will append "node" to the queue from a dequeuer
	 * perspective.
	 */
	CMM_STORE_SHARED(old_tail->next, new_head);

within ___cds_wfcq_append() as racy.
This pairs with:

/*
 * Waiting for enqueuer to complete enqueue and return the next node.
 */
static inline struct cds_wfcq_node *
___cds_wfcq_node_sync_next(struct cds_wfcq_node *node, int blocking)
{
	struct cds_wfcq_node *next;
	int attempt = 0;

	/*
	 * Adaptative busy-looping waiting for enqueuer to complete enqueue.
	 */
	while ((next = CMM_LOAD_SHARED(node->next)) == NULL) {
		if (___cds_wfcq_busy_wait(&attempt, blocking))
			return CDS_WFCQ_WOULDBLOCK;
	}

	return next;
}
So the release semantic is provided by the implicit SEQ_CST barrier in
___cds_wfcq_append():

	old_tail = uatomic_xchg(&tail->p, new_tail);    (release)

and the acquire semantic is provided by the implicit SEQ_CST barrier in
___cds_wfcq_splice():

	/*
	 * Memory barrier implied before uatomic_xchg() orders store to
	 * src_q->head before store to src_q->tail. This is required by
	 * concurrent enqueue on src_q, which exchanges the tail before
	 * updating the previous tail's next pointer.
	 */
	tail = uatomic_xchg(&src_q_tail->p, &src_q_head->node);
Notice how the release/acquire semantic is provided by tail->p, which is
atomically modified _before_ we set the node->next pointer.
With this information, is there a specific annotation that would make sense?
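For illustration only, here is a stand-alone analog of that hand-off written with plain C11 atomics and pthreads (not liburcu code): the producer publishes a node through a relaxed store standing in for CMM_STORE_SHARED(), the consumer busy-waits like ___cds_wfcq_node_sync_next(), and a __tsan_release()/__tsan_acquire() pair on the node is one candidate annotation; whether the node (rather than the tail pointer) is the right address to pair on is exactly the open question:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sanitizer/tsan_interface.h>	/* build with -fsanitize=thread */

struct node {
	int payload;
};

static struct node the_node;
/* Stands in for old_tail->next / node->next in wfcqueue. */
static _Atomic(struct node *) next_ptr;

static void *
producer(void *arg) {
	(void)arg;
	the_node.payload = 42;			/* plain write before publication */
	__tsan_release(&the_node);		/* candidate annotation (enqueue side) */
	atomic_store_explicit(&next_ptr, &the_node, memory_order_relaxed);
	return NULL;
}

static void *
consumer(void *arg) {
	struct node *next;

	(void)arg;
	/* Busy-wait, like ___cds_wfcq_node_sync_next() minus the back-off. */
	while ((next = atomic_load_explicit(&next_ptr, memory_order_relaxed)) == NULL) {
		/* spin */
	}
	__tsan_acquire(next);			/* candidate annotation (dequeue side) */
	printf("%d\n", next->payload);		/* plain read after the hand-off */
	return NULL;
}

int
main(void) {
	pthread_t p, c;

	pthread_create(&p, NULL, producer, NULL);
	pthread_create(&c, NULL, consumer, NULL);
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return 0;
}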
Thanks,
Mathieu
Ondrej
--
Ondřej Surý (He/Him)
ond...@sury.org
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev