------- Comment From balb...@au1.ibm.com 2016-07-17 21:43 EDT-------
I tried the kernel at http://people.canonical.com/~kamal/lp1573062/lp1573062.1/ and it worked fine for me.
------- Comment From balb...@au1.ibm.com 2016-07-19 01:04 EDT-------
Looks like I got a failure with the run on http://people.canonical.com/~kamal/lp1573062/lp1573062.1/

But with my diff + 4.4.0 source from apt-source, I can always get the following command to succeed:

timeout -s 9 $end_time stress-ng --aggressive --verify --timeout $runtime --brk 0

I've tried three times with my diff (all successes) and twice with the kernel @ ~kamal (one failure and one success). I've not tried the longer 7-hour run.

------- Comment From balb...@au1.ibm.com 2016-07-19 01:37 EDT-------
In the kern.log posted, it looks like the problem has moved to

rwsem_wake+0xcc/0x110
up_write+0x78/0x90
unlink_anon_vmas+0x15c/0x2c0

A bunch of threads are stuck in rwsem_wake, spinning on sem->wait_lock. I can see a whole bunch of exiting stress-ng-mmapf processes stuck spinning on this lock. I'll double check this. Can we get a build with lockdep enabled? I am unable to reproduce this issue at my end with the diff applied on my machine at the moment.

------- Comment From balb...@au1.ibm.com 2016-07-19 19:51 EDT-------
I am cloning the sources to debug further.

------- Comment From balb...@au1.ibm.com 2016-07-19 23:52 EDT-------
I cloned the kernel from https://git.launchpad.net/~kamalmostafa/ubuntu/+source/linux/+git/xenial/log/?h=lp1573062 and built it with the machine config from /boot/config. I also verified the diff matches my changes. I ran

timeout -s 9 $end_time stress-ng --aggressive --verify --timeout $runtime --brk 0

twice. Both times the test did the right thing.

Could someone verify whether
(a) the smaller subset works fine?
(b) the larger test fails? If so, can we get a run with lockdep?

I was only testing the command line above, and I could see a difference with those patches.
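The repeated runs described above (three with one kernel, two with another) could be scripted so that each run's result is recorded the same way. The following is a minimal sketch, not something posted in this bug: `run_repeated` is a hypothetical helper, and `$end_time` and `$runtime` are left unset because the thread does not give their values. It only tallies exit statuses; a hang that takes the console away, as reported here, would still have to be observed by hand.

```shell
#!/bin/sh
# Sketch only: repeat a command several times and tally its exit statuses.
# run_repeated is a hypothetical helper, not part of stress-ng or the bug report.

# Usage: run_repeated COUNT CMD [ARGS...]
# Runs CMD COUNT times, prints one line per run plus a summary line,
# and returns 0 only if every run exited with status 0.
run_repeated() {
    rr_count=$1
    shift
    rr_pass=0
    rr_fail=0
    rr_i=1
    while [ "$rr_i" -le "$rr_count" ]; do
        # Capture the exit status without aborting if the caller uses `set -e`.
        if "$@"; then
            rr_status=0
        else
            rr_status=$?
        fi
        if [ "$rr_status" -eq 0 ]; then
            rr_pass=$((rr_pass + 1))
            echo "run $rr_i: ok"
        else
            rr_fail=$((rr_fail + 1))
            echo "run $rr_i: FAILED (exit status $rr_status)"
        fi
        rr_i=$((rr_i + 1))
    done
    echo "passed=$rr_pass failed=$rr_fail"
    [ "$rr_fail" -eq 0 ]
}

# Example invocation (commented out). end_time and runtime must be set by the
# caller; their values are not stated in the thread.
# run_repeated 4 timeout -s 9 "$end_time" stress-ng --aggressive --verify \
#     --timeout "$runtime" --brk 0
```

Running the same helper against both the locally built kernel and the provided binaries would give directly comparable pass/fail counts.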
------- Comment From balb...@au1.ibm.com 2016-07-21 20:14 EDT-------
No, the diff matches. Sorry for the confusion, but here is what I said: "I also verified the diff matches my changes".

In summary, here is what I did:

1. Cloned the sources
2. Built locally on my machine
3. Ran stress-ng with the recommended parameters
4. The test succeeded and I got back the console

I did four runs and got back the console each time. However, with the provided binaries, step 3 (stress-ng) failed for me once in two runs.

------- Comment From balb...@au1.ibm.com 2016-07-25 08:08 EDT-------
Strange, I am able to reproduce the issue with the provided binaries, but not with a kernel I build myself. I am not doing a deb build, just a make -j64 with the config from /boot for 4.4.0-28. The problem could be at my end, but I am a little concerned. I also noticed that the run succeeds if I interact with the system while it is running, frequently checking whether the console is active (enters and control-o-h). I am going to see if I can get a repro again and debug further.

------- Comment From balb...@au1.ibm.com 2016-07-25 09:09 EDT-------
In the meantime, any updates on the bisect? I was hoping we could do both things (RCA and bisect) in parallel.

Thanks,
Balbir

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1573062

Title:
  memory_stress_ng failing for IBM Power S812LC(TN71-BP012) for 16.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1573062/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs