On Fri, Aug 24, 2012 at 3:34 PM, John Drescher <dresche...@gmail.com> wrote:
> On Thu, Aug 23, 2012 at 1:34 PM, John Drescher <dresche...@gmail.com> wrote:
>> Over the last few weeks I have done some reliability testing with
>> mdraid6 on a machine with 2 lsi mptsas controllers and 13 SATA I
>> drives. My testing involved physically hot removing a drive forcing
>> the raid to grab a spare and rebuild. This worked great for the 5 or
>> so times I did this on gentoo-sources-3.5.0 and lower. However any
>> attempt to do this on gentoo-sources-3.5.1 or even the 3.6-rc2 git
>> resulted in a total lockup of the array. I originally thought this was
>> a mdadm regression and posted about that last week here:
>>
>> https://lkml.org/lkml/2012/8/17/503
>>
>
> I have bisected the kernel a few times now and the problem was
> introduced between
> 3.5.0.00007-ged29dbd
> 3.5.0.00015-g4d9157e
>
> After the raid rebuilds again I will bisect again and see if I can
> narrow it down to the exact patch.
>

I have bisected it down to the following patch:

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[10f8d5b86743b33d841a175303e2bf67fd620f42] SCSI: fix hot unplug vs async scan race

It appears this patch caused the bad behavior although I have not
tested that yet. I am rebuilding the array (takes ~2 hours) from the
previous good bisect.

--
John M. Drescher