Ingo Molnar wrote: > * Nick Piggin <[EMAIL PROTECTED]> wrote: > > >> On Thursday 15 November 2007 21:43, Ingo Molnar wrote: >> >>> * David Miller <[EMAIL PROTECTED]> wrote: >>> >>>> From: Matt Mackall <[EMAIL PROTECTED]> >>>> Date: Wed, 14 Nov 2007 17:37:13 -0600 >>>> >>>> >>>>> No, the usual strategy for debugging problems -outside- SLOB is to >>>>> switch to another allocator with more extensive debugging facilities. >>>>> >>>> Ok, so the thing we still can do is do a dump_stack() at the list >>>> debugging assertion trigger points. >>>> >>> ok, i'll first try to trigger it again. >>> >> I had implemented SLOB in userspace, so I resynched and think I found >> your problem. Sorry for the attachment format -- this mailer isn't the >> best. I'm really computer illiterate when it comes to userspace... >> > > thx, i'll try your fix in a minute. > > >> Anyway, I'm really happy to see you're testing and using SLOB upstream >> :) Is there any particular reason that you're using it? >> > > i sometimes test SLOB for -rt, but this time it's the result of my > "automated random QA" effort, as part of arch/x86 maintainance/QA. > > the main trick is to build and booting random "make randconfig" > bzImages. That finds build bugs and a good deal of boot hang and crash > bugs as well. (it also found a compiler bug already) I can build and > boot about 1000 random kernels in 24 hours, and it's all fully > automated. I usually run it overnight - when a kernel does not come up > due to a bootup hang or crash (or the kernel log signals any exception > condition) then the script stops and i can fix it in the morning. > > The first step towards this was to get allyesconfig bzImage kernels to > build and boot fine. That effort took months (we had many problems in > this area) - i think you saw bugreports and fixes from me about that on > lkml. > > Once that worked reasonably well i made a small Kconfig patch that > forcibly selects a "minimum set" of drivers and kernel subsystems that > are needed to boot up a testsystem. Once a "make allnoconfig" and a > "make allyesconfig" bzImage kernel boots up fine on the testbox all > randconfig configs "inbetween" are supposed to build and boot fine as > well. > > I also have a patch that adds all the x86 boot options like nosmp, > maxcpus=1, nohz=off, hpet=disable to be selectable as .config options - > so those boot options are randomized as well. > > I also have a small patch that disables half a dozen drivers/features > that are not expected to work out of box in a bzImage kernel. (such as > ISA drivers that assume the presence of hardware, or root filesystem > features such as NFSROOT) > > the resulting make randconfig kernel still has 99% of the degrees of > freedom that a stock make randconfig kernel has, so by all practical > purposes it's a fully random kernel - it just happens to boot on my > testsystem all the time. > > A successful bootup means the test system is able to boot up into a > stock Fedora 8 userspace and is able to bring up its network interfaces > and ssh out (automatically) to the build box to signal the completion of > a successful test cycle. The logs are also analyzed for lockdep > assertions (if lockdep is enabled - which it is in about 20% of the > randconfig kernels) and other kernel bugs. > > (just in case you were wondering about one of the reasons why the > arch/x86 unification merge went so smoothly, with nary a regression ;-) > Thomas is doing other types of automated QA of the x86 queue as well.) > > this method found the SG-list corruption bugs the following night after > Linus committed Jen's SG-list changes, so it's pretty good at finding > regressions as early as possible. > > Ingo >
How complete is the QA testing? I was reading this interesting thread and it occurred to me that this sounds like a useful distributed computing application. ie a central server with all valid Kconfig combinations (how many are there?) for a particular release (-rc or otherwise) across all architectures. These are allocated to clients on request to be built / booted etc. Any errors are fed back to the central server. I guess this would be a useful resource for developers. More importantly (and I don't know if this is the case already!) a new Linux release (2.6.x) could be "certified" with some level of testing on known hardware / architectures. tbh, I feel sorry for Ingo's machine compiling 1000 random kernels in 24h! I'm surprised it hasn't called the Samaritans... Dave. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/