On 10.03.20 17:37, Jan Beulich wrote:
> On 10.03.2020 17:34, Jürgen Groß wrote:
>> On 10.03.20 17:29, Jan Beulich wrote:
>>> On 10.03.2020 08:28, Juergen Gross wrote:
>>>> +void rcu_barrier(void)
>>>>  {
>>>> -    atomic_t cpu_count = ATOMIC_INIT(0);
>>>> -    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
>>>> +    unsigned int n_cpus;
>>>> +
>>>> +    while ( !get_cpu_maps() )
>>>> +    {
>>>> +        process_pending_softirqs();
>>>> +        if ( !atomic_read(&cpu_count) )
>>>> +            return;
>>>> +
>>>> +        cpu_relax();
>>>> +    }
>>>> +
>>>> +    n_cpus = num_online_cpus();
>>>> +
>>>> +    if ( atomic_cmpxchg(&cpu_count, 0, n_cpus) == 0 )
>>>> +    {
>>>> +        atomic_add(n_cpus, &done_count);
>>>> +        cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>>>> +    }
>>>> +
>>>> +    while ( atomic_read(&done_count) )
>>>
>>> Don't you leave a window for races here, in that done_count
>>> gets set to non-zero only after setting cpu_count? A CPU
>>> losing the cmpxchg attempt above may observe done_count
>>> still being zero, and hence exit without waiting for the
>>> count to actually _drop_ to zero.
>>
>> This can only be a cpu not having joined the barrier handling, so it
>> will do that later.
>
> I'm afraid I don't understand - if two CPUs independently call
> rcu_barrier(), neither should fall through here without waiting
> at all, I would think?
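
For reference, the interleaving described above can be replayed step by step. This is only a minimal sketch using C11 atomics in place of Xen's atomic_t helpers; try_claim(), publish_done_count(), would_wait() and replay_race() are hypothetical names for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the two counters used by the quoted rcu_barrier(). */
static atomic_int cpu_count;
static atomic_int done_count;

/* Mirrors atomic_cmpxchg(&cpu_count, 0, n_cpus) == 0 in the patch. */
static bool try_claim(int n_cpus)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&cpu_count, &expected, n_cpus);
}

/* Mirrors atomic_add(n_cpus, &done_count); only the winner runs this. */
static void publish_done_count(int n_cpus)
{
    atomic_fetch_add(&done_count, n_cpus);
}

/* Mirrors the condition of the final while ( atomic_read(&done_count) ). */
static bool would_wait(void)
{
    return atomic_load(&done_count) != 0;
}

/*
 * Replays one problematic ordering of two callers, A and B.
 * Returns true if B falls through the final loop without waiting.
 */
static bool replay_race(int n_cpus)
{
    assert(n_cpus > 0);
    atomic_store(&cpu_count, 0);
    atomic_store(&done_count, 0);

    bool a_won = try_claim(n_cpus);    /* A: cmpxchg succeeds            */
    bool b_won = try_claim(n_cpus);    /* B: cmpxchg fails               */
    bool b_waits = would_wait();       /* B: done_count is still zero    */
    publish_done_count(n_cpus);        /* A: add arrives too late for B  */

    return a_won && !b_won && !b_waits;
}
```

Calling replay_race(4) yields true: the claim on cpu_count and the update of done_count are two separate steps, so B can read done_count in between them.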
Oh, good catch!
I have thought more about this problem and I think using counters only
for doing rendezvous accounting is rather risky. I'll have a try using
a cpumask instead.
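
One possible shape of that idea, purely as a sketch: a single mask of CPUs that still have to pass the barrier, so that claiming the barrier and publishing the set of pending CPUs happen in one atomic step. The uint64_t mask is a hypothetical stand-in for a cpumask_t (so at most 64 CPUs here), and barrier_begin(), barrier_cpu_done() and barrier_in_progress() are made-up names, not the eventual implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask_t: bit i set = CPU i still pending. */
static _Atomic uint64_t pending_mask;

/*
 * Start a barrier unless one is already in progress. The whole set of
 * pending CPUs becomes visible in the same atomic operation that claims
 * the barrier, so a caller losing the exchange always sees a non-empty mask.
 */
static bool barrier_begin(uint64_t online_mask)
{
    uint64_t expected = 0;

    assert(online_mask != 0);
    return atomic_compare_exchange_strong(&pending_mask, &expected,
                                          online_mask);
}

/*
 * Called by a CPU once its part of the barrier is done.
 * Returns true if this CPU was the last one still pending.
 */
static bool barrier_cpu_done(unsigned int cpu)
{
    assert(cpu < 64);

    uint64_t bit = UINT64_C(1) << cpu;
    uint64_t old = atomic_fetch_and(&pending_mask, ~bit);

    return (old & bit) != 0 && (old & ~bit) == 0;
}

/* Condition a waiter would poll instead of reading a counter. */
static bool barrier_in_progress(void)
{
    return atomic_load(&pending_mask) != 0;
}
```

With this layout a second caller whose barrier_begin() fails can simply wait until barrier_in_progress() turns false; there is no window in which the barrier is claimed but nothing appears to be pending.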
Juergen