On Fri, May 10, 2019 at 02:08:19PM +0200, Peter Zijlstra wrote:
> On Thu, May 09, 2019 at 12:36:25PM -0700, Paul E. McKenney wrote:
> > I forward-ported the relevant patches from -rcu and placed them on -rcu
> > branch peterz.2019.05.09a, and this is what produced the output above.
> >
> > Any other debugging thoughts?
> >
> > Or, if you wish, you can reproduce by running the following:
> >
> > nohup tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 2
> > --configs "TRIVIAL" --bootargs
> > "trace_event=sched:sched_switch,sched:sched_wakeup ftrace=function_graph
> > ftrace_graph_filter=sched_setaffinity,migration_cpu_stop" --kconfig
> > "CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y"
> >
> > This gets me the following summary output:
> >
> > --- Thu May 9 12:08:31 PDT 2019 Test summary:
> > Results directory:
> > /home/git/linux-2.6-tip/tools/testing/selftests/rcutorture/res/2019.05.09-12:08:31
> > tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 2
> > --configs TRIVIAL --bootargs
> > trace_event=sched:sched_switch,sched:sched_wakeup ftrace=function_graph
> > ftrace_graph_filter=sched_setaffinity,migration_cpu_stop --kconfig
> > CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y
> > TRIVIAL ------- 2177 GPs (18.1417/s) [trivial: g0 f0x0 ]
> > :CONFIG_HOTPLUG_CPU: improperly set
> > WARNING: BAD SEQ 2176:2176 last:2177 version 4
> >
> > /home/git/linux-2.6-tip/tools/testing/selftests/rcutorture/res/2019.05.09-12:08:31/TRIVIAL/console.log
> > WARNING: Assertion failure in
> > /home/git/linux-2.6-tip/tools/testing/selftests/rcutorture/res/2019.05.09-12:08:31/TRIVIAL/console.log
> > WARNING: Summary: Warnings: 1 Bugs: 1 Call Traces: 5 Stalls: 8
>
> So I could reproduce...
>
> I must first complain about your scripts; it does "make mrproper" on the
> source tree every time you run it, this is not appreciated. For one, it
> deletes my 'tags' file.

This is because it builds in a different directory, and "make O=/path"
complains if you don't have the source directory pristine. But there
really is no longer any reason to build in a different directory, I
suppose. This is a largish change, but working on it.

> Getting it to not rebuild the whole kernel every time wasn't easy
> either.

You trust "make" far more than I do! I am thinking of adding a
"--trust-make" argument that suppresses the "make clean". Maybe if I
grow to trust "make" in the fulness of time, I can remove the "make clean"
entirely.

But given ccache, and given the duration of the typical rcutorture run,
and given that there are multiple rcutorture scenarios each with a
different .config, this hasn't been a priority. The build step is
already omitted for repeated runs.

> Aside from that it seems to 'work'.
>
> The below trace explain the issue. Some Paul person did it, see below.
> It's broken per construction :-)

*facepalm*

Hence the very strange ->cpus_allowed mask. I really should have
figured that one out.

The fix is straightforward. I just added "rcutorture.shuffle_interval=0"
to the TRIVIAL.boot file, which stops rcutorture from shuffling its
kthreads around.

Please accept my apologies for the hassle, and thank you for tracking
this down!!!

Thanx, Paul