On Wed, Feb 13, 2013 at 11:56:25AM -0800, Linus Torvalds wrote:
 > On Wed, Feb 13, 2013 at 11:34 AM, Dave Jones <da...@redhat.com> wrote:
 > >
 > > My test was a loop of 100 suspend/resume cycles before calling something
 > > 'good'. The 'bad' cases all failed within 10 cycles (usually 2-3).
 > 
 > Considering that you apparently already found one case where the BIOS
 > crapped out due to effectively unrelated timing details (ie timing
 > triggered a temperature issue that then triggered behavioral changes),
 > I wonder if your more occasional problem might not be a sign of
 > something similar.
 > 
 > But since you seem to be able to automate it well, maybe one thing to
 > try is to change the timing a bit while testing. Maybe some failures
 > were hidden by the timing just happening to work out.

Given I never saw this on a Fedora kernel, just my self-built ones, I eventually
gave up on bisecting code, and switched to bisecting config options.
I should have started this way, as I figured it out within an hour.
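
For anyone wanting to try the same approach, here is a rough sketch of bisecting config options, not the exact procedure used above. The function name `bisect_config` and the `is_bad` callback are hypothetical; in practice `is_bad` would build a kernel from the trial config and run the suspend/resume loop. The sketch assumes a single option is responsible and ignores Kconfig dependencies between options, so a real run needs each trial config fixed up before building.

```python
from typing import Callable, Dict, List


def bisect_config(
    good: Dict[str, str],
    bad: Dict[str, str],
    is_bad: Callable[[Dict[str, str]], bool],
) -> str:
    """Return the one option whose 'bad' setting makes is_bad() true.

    Assumption: exactly one differing option causes the failure and
    options can be toggled independently of each other.
    """
    # Only options that differ between the two configs can be the culprit.
    suspects: List[str] = sorted(
        k for k in set(good) | set(bad) if good.get(k) != bad.get(k)
    )
    if not suspects:
        raise ValueError("configs are identical, nothing to bisect")

    while len(suspects) > 1:
        mid = len(suspects) // 2
        first_half = suspects[:mid]

        # Start from the good config and apply the bad settings for
        # the first half of the suspects only.
        trial = dict(good)
        for key in first_half:
            if key in bad:
                trial[key] = bad[key]
            else:
                trial.pop(key, None)

        # Keep whichever half still contains the culprit.
        suspects = first_half if is_bad(trial) else suspects[mid:]

    return suspects[0]
```

With a failure that only shows up some of the time, a "good" verdict from `is_bad` is only as trustworthy as the number of cycles behind it, so each step may need repeating.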

The 3.7 merge window is when I started seeing this, and here's what got introduced
during that time:

commit e3ebfb96f396731ca2d0b108785d5da31b53ab00
Author: Paul E. McKenney <paul.mcken...@linaro.org>
Date:   Mon Jul 2 14:42:01 2012 -0700

    rcu: Add PROVE_RCU_DELAY to provoke difficult races

'difficult' is an understatement.  This explains why some of those 'good'
bisects survived 100 suspends on one day, and failed the next.

Unfortunately, I don't think there's any sane way to retrieve whatever debug
info might be getting spewed.  Perhaps when I reinstall and switch to booting
EFI I'll be able to use pstore, but on a BIOS-based boot, all hope seems lost.
No netconsole, no usb-serial, and even crippling i915's suspend routine doesn't
help.

I'll just disable this option for now.
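
As a minimal sketch of what disabling it amounts to in a `.config` file: the helper name `disable_option` is hypothetical, and the only thing relied on is the usual `# CONFIG_FOO is not set` convention for a disabled option. It edits text only and does not resolve Kconfig dependencies, so the result would still need a pass through `make oldconfig`.

```python
import re


def disable_option(config_text: str, option: str) -> str:
    """Rewrite 'CONFIG_<option>=...' as '# CONFIG_<option> is not set'.

    Matches the option name exactly, so disabling PROVE_RCU_DELAY
    leaves CONFIG_PROVE_RCU untouched (and vice versa).
    """
    pattern = re.compile(
        rf"^CONFIG_{re.escape(option)}=.*$", re.MULTILINE
    )
    return pattern.sub(f"# CONFIG_{option} is not set", config_text)
```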

        Dave
