On Wednesday 17 April 2002 11:58 pm, Karsten M. Self wrote:
> on Wed, Apr 17, 2002, Jamin W. Collins ([EMAIL PROTECTED]) wrote:
> > On Wed, 17 Apr 2002 11:53:10 -0700
> > "Karsten M. Self" <kmself@ix.netcom.com> wrote:
> > > on Wed, Apr 17, 2002, Jamin W. Collins ([EMAIL PROTECTED]) wrote:
> > > > >   - CPU -- a continuous kernel-build loop is a pretty good test.
> > > > >     You're looking for SIG-11 errors.
> > > >
> > > > I'll give that a try, how long would you suggest it run before
> > > > considering this test to have been passed?
> > >
> > > Run through it one or more times.  In some cases there are thermal
> > > effects.  Depending on processor speed, you may want to set a
> > > continuous loop and run the process for an hour or so.
> >
> > Created two copies of the 2.4.16 kernel source and have currently been
> > running two endless compiling loops in SSH sessions to the system.  The
> > loops have been running for 4+ hours now, planning to let them run
> > overnight.
>
> Sounds like a negative.
>
> > > You might try mounting your drives 'sync' (synchronous mode), and
> > > launching Mozilla under strace, logging stderr.  This may be able to
> > > capture the final system calls of the program.
> >
> > I'll give this a run tomorrow.  Hopefully I can readily get Mozilla to
> > drop the system.
>
> You've largely eliminated memory and CPU.
>
> Another possible HW problem might be a disk corruption in your swap
> partition.  I suggest this just because I know that Mozilla tends to
> grow, and stress swap.  Though I would tend toward a driver issue.  Not
> sure of a good swap tester, anyone have any suggestions?
>
as far as i can trace back, i can't see any precise description of the
architecture on which this is happening to you.

i've got an amd k6 chip where the same symptoms happened with damn near
every kernel iteration from 2.2.19 up until the current 2.4.17.  i traced
my problem back to one point in the kernel source, namely slab.c:1248,
caused, as far as my investigation could determine, by an attempted
allocation of a non-allocatable memory address.  i tried to contact the
author of that piece of the kernel code but ended up with an undeliverable
message about a week later.  contacting amd got a response that they did
not have the resources to investigate linux-related bugs.

in any case, in about six months of running 2.4.17, the problem has never
recurred.  in fact, i've been so relieved to have a crash-free
configuration that i've postponed the effort to test any kernel beyond the
one that works.

the symptoms were exactly the same: random system freezes, often during
boot, regularly when running x, but always infrequent, with no discernible
pattern, and with nothing registered in dmesg or any of the relevant x
logs.  i eventually found the clue in kern.log.  those entries stop on the
very date that i compiled the 2.4.17 kernel from the kernel.org source,
which was also the first kernel i compiled the non-debian way.  maybe that
matters.

in the event that this same issue is the cause of your grief, sorry i
didn't file a bug report back then.  i guess i could be more sociable, but
i kind of lose myself in the task of curing the problem and tend to forget
that part of the allegiance to linux.

ben
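
For anyone wanting to run the same kern.log check on their own box, here is a minimal sketch in Python. It is only an illustration, not what was actually run at the time. The function names, the default log path /var/log/kern.log, and the idea of matching a bare "slab.c:<line>" token are all assumptions; the exact wording of the kernel message is not given in this thread.

```python
import re
from typing import Iterable, List


def find_source_refs(lines: Iterable[str], filename: str = "slab.c") -> List[str]:
    """Return log lines that mention `filename:<line number>`.

    Assumption: the kernel message names the source file and line
    (e.g. "slab.c:1248"). Only that token is matched, because the
    surrounding message text is not known from the thread.
    """
    pattern = re.compile(re.escape(filename) + r":\d+")
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_log(path: str = "/var/log/kern.log", filename: str = "slab.c") -> List[str]:
    """Scan one log file; the default path is an assumed Debian-style location."""
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with open(path, errors="replace") as log:
        return find_source_refs(log, filename)
```

Calling `scan_log()` and printing the result shows the dates on which the message appeared. If the hits stop at a kernel upgrade, that lines up with the observation above.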