Testing of rcutorture's SRCU-P scenario on a large arm64 system resulted in rcu_torture_writer() forward-progress failures, but these same tests passed on x86. After some off-list discussion of possible memory-ordering causes for these failures, Boqun showed that these were in fact due to reordering, but by the scheduler, not by the memory system. On x86, rcu_torture_writer() would have run quickly enough that by the time the rcu_torture_updown() kthread started, the rcu_torture_current variable would already be initialized, thus avoiding a bug in which a NULL value would cause rcu_torture_updown() to do an extra call to srcu_up_read_fast().
This commit therefore moves creation of the rcu_torture_writer() kthread after that of the rcu_torture_reader() kthreads. This results in deterministic failures on x86. What about the double-srcu_up_read_fast() bug? Boqun has the fix. But let's also fix the test while we are at it!

Reported-by: Joel Fernandes <joelagn...@nvidia.com>
Reported-by: Boqun Feng <boqun.f...@gmail.com>
Signed-off-by: Paul E. McKenney <paul...@kernel.org>
---
 kernel/rcu/rcutorture.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index d94b24f19cf59..62f082e24d3b9 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -4476,11 +4476,6 @@ rcu_torture_init(void)
 	/* Start up the kthreads. */
 
 	rcu_torture_write_types();
-	firsterr = torture_create_kthread(rcu_torture_writer, NULL,
-					  writer_task);
-	if (torture_init_error(firsterr))
-		goto unwind;
-
 	if (nrealfakewriters > 0) {
 		fakewriter_tasks = kcalloc(nrealfakewriters,
 					   sizeof(fakewriter_tasks[0]),
@@ -4516,6 +4511,11 @@ rcu_torture_init(void)
 	firsterr = rcu_torture_updown_init();
 	if (torture_init_error(firsterr))
 		goto unwind;
+	firsterr = torture_create_kthread(rcu_torture_writer, NULL,
+					  writer_task);
+	if (torture_init_error(firsterr))
+		goto unwind;
+
 	nrealnocbers = nocbs_nthreads;
 	if (WARN_ON(nrealnocbers < 0))
 		nrealnocbers = 1;
-- 
2.40.1