Re: Make sure we populate the initroot filesystem late enough

2007-03-15 Thread Paul TBBle Hampson
On Tue, Mar 13, 2007 at 08:03:49AM +0100, Benjamin Herrenschmidt wrote: >> Hmm. The crash came back after I booted into Mac OS X and back. It was >> however >> a different crash, I believe it was coming from the USB modules (as it would >> keep going when it happened, and get another crash, which

Re: Make sure we populate the initroot filesystem late enough

2007-03-12 Thread Benjamin Herrenschmidt
> Hmm. The crash came back after I booted into Mac OS X and back. It was however > a different crash, I believe it was coming from the USB modules (as it would > keep going when it happened, and get another crash, which tended to scroll > away > too fast for me to capture) but I believe it was st

Re: Make sure we populate the initroot filesystem late enough

2007-03-12 Thread Kumar Gala
On Mar 12, 2007, at 6:01 PM, Paul TBBle Hampson wrote: On Thu, Mar 01, 2007 at 09:30:56AM +0900, Michael Ellerman wrote: On Wed, 2007-02-28 at 10:13 +0000, David Woodhouse wrote: On Wed, 2007-02-28 at 07:43 +0100, Benjamin Herrenschmidt wrote: I wouldn't be that sure ... I've had problems in

Re: Make sure we populate the initroot filesystem late enough

2007-03-12 Thread Paul TBBle Hampson
On Thu, Mar 01, 2007 at 09:30:56AM +0900, Michael Ellerman wrote: > On Wed, 2007-02-28 at 10:13 +0000, David Woodhouse wrote: >> On Wed, 2007-02-28 at 07:43 +0100, Benjamin Herrenschmidt wrote: > >> I wouldn't be that sure ... I've had problems in the past with PMU based > >> cpufreq... looks like

Re: Make sure we populate the initroot filesystem late enough

2007-02-28 Thread Michael Ellerman
On Wed, 2007-02-28 at 10:13 +0000, David Woodhouse wrote: > On Wed, 2007-02-28 at 07:43 +0100, Benjamin Herrenschmidt wrote: > > I wouldn't be that sure ... I've had problems in the past with PMU based > > cpufreq... looks like flushing all caches and hard-resetting the > > processor on the fly whe

Re: Make sure we populate the initroot filesystem late enough

2007-02-28 Thread David Woodhouse
On Wed, 2007-02-28 at 07:43 +0100, Benjamin Herrenschmidt wrote: > I wouldn't be that sure ... I've had problems in the past with PMU based > cpufreq... looks like flushing all caches and hard-resetting the > processor on the fly when there can be pending DMAs might be a source of > trouble... espe

Re: Make sure we populate the initroot filesystem late enough

2007-02-27 Thread Benjamin Herrenschmidt
> It's most likely a red herring, lots of config changes > make the bug go away on some kernel versions (but not > on others); the problem is very sensitive to changes in > memory layout. I wouldn't be that sure ... I've had problems in the past with PMU based cpufreq... looks like flushing all c

Re: Make sure we populate the initroot filesystem late enough

2007-02-27 Thread Segher Boessenkool
I've not been able to reproduce it since, but I know others (BCC'ed on this note) have seen it and might prod them to come forth with details (and broken .config files) In my case, disabling CPU_FREQ_PMAC made the failure go away. After reverting this patch, CPU_FREQ_PMAC is once again operati

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Benjamin Herrenschmidt
> > I've not been able to reproduce it since, but I know others (BCC'ed on > > this note) have seen it and might prod them to come forth with details > > (and broken .config files) > > In my case, disabling CPU_FREQ_PMAC made the failure go away. > After reverting this patch, CPU_FREQ_PMAC is onc

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Benjamin Herrenschmidt
> USB controller issues? We used to have these really hard-to-debug problems > with the USB controller being active and having had the BIOS set up the > command queues etc. Really subtle. It's why we now have PCI quirks for > shutting up (most) USB controllers very early. On powermacs or power

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Paul TBBle Hampson
On Mon, Feb 26, 2007 at 11:27:47AM -0800, john stultz wrote: > On Sun, 2007-02-25 at 19:00 -0500, David Woodhouse wrote: >> On Mon, 2006-12-11 at 20:59 +0000, Linux Kernel Mailing List wrote: > >> > >> Make sure we populate the initroot filesystem late enough >> This seems to be what's trigge

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Linus Torvalds
On Mon, 26 Feb 2007, David Woodhouse wrote: > > Now I'm starting to wonder if it's something the firmware sets up to DMA > to a certain region of memory, which makes it non-deterministic. And the > other things we're blaming are only making a difference because they > change the layout of what w

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread David Woodhouse
On Mon, 2007-02-26 at 10:44 -0600, Milton Miller wrote: > Any chance you are using one of the unusual code paths, like the > bootloader moving the initrd or using a kernel crash region? I'm doing nothing special. And I'm less sure now about the trigger. I built a Fedora 7 test 2 install tree with

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Kumar Gala
On Feb 26, 2007, at 9:51 AM, Benjamin Herrenschmidt wrote: On Sun, 2007-02-25 at 20:17 -0500, David Woodhouse wrote: On Sun, 2007-02-25 at 16:24 -0800, Linus Torvalds wrote: Hmm. No, I don't think that should be a problem. free_initmem() only happens at the very end, after do_basic_setup() has be

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread john stultz
On Sun, 2007-02-25 at 19:00 -0500, David Woodhouse wrote: > On Mon, 2006-12-11 at 20:59 +0000, Linux Kernel Mailing List wrote: > > > > Make sure we populate the initroot filesystem late enough > > This seems to be what's triggering the apparent memory corruption we've > been seeing recently -

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Milton Miller
On Feb 27, 2007, at 2:24 AM, David Woodhouse wrote: On Sun, 2007-02-25 at 20:13 -0800, Linus Torvalds wrote: On Sun, 25 Feb 2007, David Woodhouse wrote: Can you try adding something like memset(start, 0xf0, end - start); Yeah, I did that before giving up on it for the day and going i

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread David Woodhouse
On Sun, 2007-02-25 at 20:13 -0800, Linus Torvalds wrote: > > On Sun, 25 Feb 2007, David Woodhouse wrote: > > > > > Can you try adding something like > > > > > > memset(start, 0xf0, end - start); > > > > Yeah, I did that before giving up on it for the day and going in search > > of dinner

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Segher Boessenkool
And check that we didn't end up stupidly having the initrd share a page with something else ... (like not aligned end or such thingy). David tested that yesterday, it's not the case. Too bad, would have been too easy ;-) Segher

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Benjamin Herrenschmidt
On Sun, 2007-02-25 at 23:01 -0500, David Woodhouse wrote: > Yeah, I did that before giving up on it for the day and going in search > of dinner. It changes the failure mode to a BUG() in > cache_free_debugcheck(), at line 2876 of mm/slab.c > > It smells like the pages weren't actually reserved in

Re: Make sure we populate the initroot filesystem late enough

2007-02-26 Thread Benjamin Herrenschmidt
On Sun, 2007-02-25 at 20:17 -0500, David Woodhouse wrote: > On Sun, 2007-02-25 at 16:24 -0800, Linus Torvalds wrote: > > Hmm. No, I don't think that should be a problem. free_initmem() only > > happens at the very end, after do_basic_setup() has been run, which > > includes all the initcall stuff. >

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread William Lee Irwin III
On Sun, Feb 25, 2007 at 11:01:06PM -0500, David Woodhouse wrote: > Yeah, I did that before giving up on it for the day and going in search > of dinner. It changes the failure mode to a BUG() in > cache_free_debugcheck(), at line 2876 of mm/slab.c > It smells like the pages weren't actually reserved

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread Linus Torvalds
On Sun, 25 Feb 2007, David Woodhouse wrote: > > > Can you try adding something like > > > > memset(start, 0xf0, end - start); > > Yeah, I did that before giving up on it for the day and going in search > of dinner. It changes the failure mode to a BUG() in > cache_free_debugcheck(), at

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread David Woodhouse
On Sun, 2007-02-25 at 19:45 -0800, Linus Torvalds wrote: > Ok. Clearly something is using that memory. That said, I *suspect* that > the commit that you bisected to is just showing the problem indirectly. > The ordering shouldn't make any difference, but it can obviously make a > huge difference

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread Linus Torvalds
On Sun, 25 Feb 2007, David Woodhouse wrote: > > I'm inclined to agree that it _shouldn't_ be a problem. Nevertheless, > even this hack seems sufficient to 'fix' it: Ok. Clearly something is using that memory. That said, I *suspect* that the commit that you bisected to is just showing the probl

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread David Woodhouse
On Sun, 2007-02-25 at 16:24 -0800, Linus Torvalds wrote: > Hmm. No, I don't think that should be a problem. free_initmem() only > happens at the very end, after do_basic_setup() has been run, which > includes all the initcall stuff. I'm inclined to agree that it _shouldn't_ be a problem. Nevertheless

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread David Woodhouse
On Sun, 2007-02-25 at 16:24 -0800, Linus Torvalds wrote: > Hmm. No, I don't think that should be a problem. free_initmem() only > happens at the very end, after do_basic_setup() has been run, which includes > all the initcall stuff. > However, it's an interesting observation. How sure are you that i

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread Linus Torvalds
On Sun, 25 Feb 2007, David Woodhouse wrote: > > One side-effect of this patch is to move the call to free_initrd() much > later in the init sequence, potentially after other memory management > code is assuming it's already been freed. Hmm. No, I don't think that should be a problem. free_initm

Re: Make sure we populate the initroot filesystem late enough

2007-02-25 Thread David Woodhouse
On Mon, 2006-12-11 at 20:59 +0000, Linux Kernel Mailing List wrote: > Gitweb: > http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=8d610dd52dd1da696e199e4b4545f33a2a5de5c6 > Commit: 8d610dd52dd1da696e199e4b4545f33a2a5de5c6 > Parent: 8993780a6e44fb4e7ed34e33