Summary: For those who didn't follow the thread, I was investigating the error message "Kernel panic - not syncing: Aiee, killing interrupt handler." When it appears, the computer freezes completely and the only thing that still works is the power switch.
The error appears only under heavy load, such as compiling. This is a new box: Asus A8V, AMD64, 1 GB of RAM (PC3200 DDR400 Kingston) and a 200 GB SATA drive.

I was able to find out that "Aiee" is a hardware error; Intel has a nice article about it:
http://resource.intel.com/telecom/support/tnotes/tnbyos/2000/tn062.htm

So, following this lead, I tried to pinpoint the hardware error. It took me one week of trying different solutions:

1.) I ran memtest86 the first time and got some errors, so I ran the same test on the individual sticks (I have 2 x 512 MB). The individual sticks passed the test without errors. I swapped the sticks between the two slots and ran memtest86 again overnight. The test completed 17 passes without any error, so I excluded memory as the culprit.

2.) I disabled the network controller on the motherboard and installed another one on the PCI bus. This eliminated a possible IRQ conflict: the SATA drive on channel 0 was sharing an IRQ with the network controller. But it didn't help.

3.) I removed the heatsink, cleaned it with 99% isopropyl alcohol and applied a thin layer of new heatsink grease. It did not help. I still wanted to try Robert C.'s suggestion: "some arctic silver compound instead. It's good for a 3-5C. drop from the regular stuff." Anyhow, I opened the box cover and the CPU temperature dropped from about 40C to about 35C / 36C, so I decided to follow some other leads first.

4.) I removed the SATA drive and tried to install Gentoo on a standard IDE drive; this would rule out a SCSI problem and/or a buggy driver. It did not help. I had not even finished the base installation when I got the same error message: "Kernel panic - not syncing: Aiee, killing interrupt handler."

Then I got a lead from Francesco T.: "Sometimes memtest doesn't stress enough the hardware, see: http://people.redhat.com/dledford/memtest.html ..." So it made me think again about the memory.
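For anyone who wants to check for the kind of IRQ sharing mentioned in step 2.), here is a minimal sketch. It assumes the usual Linux /proc/interrupts layout, where devices sharing one interrupt line are listed comma-separated at the end of the row; the function name shared_irqs is made up for illustration.

```shell
#!/bin/bash
# Sketch only. Assumption: /proc/interrupts lists devices that share an
# IRQ on one row, separated by ", " (e.g. "libata, eth0").

# Print the rows of a /proc/interrupts-style file that name more than
# one device. Reads /proc/interrupts when no file is given.
shared_irqs() {
    awk '$1 ~ /^[0-9]+:$/ && /, / { print }' "${1:-/proc/interrupts}"
}
```

Calling shared_irqs with no argument prints the shared lines on the running system; no output means no numbered IRQ is shared. A shared line is not an error by itself, it is only a lead worth ruling out.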
I swapped the two sticks with the two sticks from one of my backup servers (PC2100, 2 x 512 MB).

So I downloaded some Linux kernel source, but it needs to be modified, because the Red Hat memtest.sh looks for a top-level directory named "linux", not "linux-2.6.-something". Instead of modifying the script, it is easier to just repack the kernel source (as per Richard F.'s help):

    tar -xzvf linux.tar.gz
    mv linux-* linux
    tar -czvf linux.tar.gz linux

One more thing: change the first line of the script from

    #!/bin/bash2

to:

    #!/bin/bash

I ran the Red Hat memory test on my main server (a different box; 20 passes, standard script setup) and it went just fine. It finished with an empty line, meaning "no error", as the web page suggests:

---quote----
How do you know if your memory passed? Very simple. If you run that script from the command line on your computer and it completes without ever spewing a single message onto your screen, then you passed. If you get messages from diff about differences between files or any other anomalies such as that, then you failed.
---end quote-----

I ran some compiles and did not get any errors or kernel panic. I also ran the Red Hat memory test on the memory sticks from my backup server and it finished without spilling a single error message.

So at this point I knew the problem was the memory. I put back the original memory sticks and the SATA drive, and used the onboard network controller again. I tried to run the Red Hat memtest.sh and it froze with the same kernel panic: "Kernel panic - not syncing: Aiee, killing interrupt handler." It appears that the test only made it into the fourth round when it froze. It did not spill any message onto the screen; it just froze with the kernel panic, as always. So I wasn't 100% sure that this would qualify as a failed memory test: "...If you get messages from diff about differences between files or any other anomalies such as that, then you failed." But I suppose it would qualify; you be the judge.
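The preparation steps above (repacking the tarball and fixing the first line of the script) can be sketched as two small functions. The third function only illustrates the quoted pass criterion (silence from diff means a pass); it is my own guess at the general idea, not the Red Hat script. All three function names are made up, and the sketch assumes the tarball unpacks into a single linux-* directory and that the script starts with exactly "#!/bin/bash2".

```shell
#!/bin/bash
# Sketch only, not the Red Hat memtest.sh.
# Assumptions: the tarball holds one linux-* directory; the script's
# first line is exactly "#!/bin/bash2". Function bodies use ( ... ) so
# they run in a subshell and their variables stay local.

# Repack tarball $1 as $2 with a top-level directory named "linux".
repack_as_linux() (
    work=$(mktemp -d) || exit 1
    status=0
    tar -xzf "$1" -C "$work" || status=1
    if [ "$status" -eq 0 ] && [ ! -d "$work/linux" ]; then
        mv "$work"/linux-* "$work/linux" || status=1
    fi
    if [ "$status" -eq 0 ]; then
        tar -czf "$2" -C "$work" linux || status=1
    fi
    rm -rf "$work"
    exit "$status"
)

# Replace a "#!/bin/bash2" first line in script $1 with "#!/bin/bash".
# Writing back with cat keeps the file's permissions.
fix_shebang() (
    tmp=$(mktemp) || exit 1
    sed '1s|^#!/bin/bash2$|#!/bin/bash|' "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    exit "$status"
)

# Illustration of the pass criterion: extract tarball $1 into $2 copies
# and diff each copy against the first. No output and exit status 0
# mean all copies matched.
extract_and_compare() (
    dir=$(mktemp -d) || exit 1
    status=0
    i=0
    while [ "$i" -lt "$2" ]; do
        mkdir "$dir/$i" && tar -xzf "$1" -C "$dir/$i" || status=1
        i=$((i + 1))
    done
    i=1
    while [ "$i" -lt "$2" ]; do
        diff -r "$dir/0" "$dir/$i" || status=1
        i=$((i + 1))
    done
    rm -rf "$dir"
    exit "$status"
)
```

For example, repack_as_linux linux-src.tar.gz linux.tar.gz followed by fix_shebang memtest.sh gets both files ready (the file names here are placeholders). A run that dies with a kernel panic never reaches the exit status, which is why a freeze should be treated as a failure too.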
Anyhow, I replaced the pair of sticks with two new ones and ran memtest.sh for 30 passes. It passed without spilling a single "error" on the line; a clean finish. I was able to emerge "kde-meta" and it finished without a single hiccup.

Thank you ALL for all your suggestions and help; it appears another mystery has been solved.

So my conclusion: do not rely on memtest86.

--
#Joseph

On Sat, 2005-07-23 at 20:23 +0200, Richard Fish wrote:
> Joseph wrote:
>
> >>[...]
> >>
> >>>-bash: ./memtest.sh: /bin/bash2: bad interpreter: No such file or
> >>>directory
> >>>
> >>>On both boxes I have bash-3.0, so what is it looking for?
> >>>
> >>Correct the first line of the script from "#!/bin/bash2" to
> >>"#!/bin/bash" and everything will be fine.
> >>
> >>Ciao
> >> Francesco
> >>
> >
> >Thank you, yes, that is what I did as soon as I posted the message.
> >Though it puzzles me why it runs on my main server and not on the new
> >box?
> >
> >Ps. my main server passed memtest.sh without any errors; I'm only
> >curious about the result for those bad RAM sticks that pass memtest86
> >on the new box. I will re-run both tests and post the results.
> >
>
> My guess is still that if you relax the memory timings in the BIOS, the
> "bad" RAM will start to work fine. Of course, *I* would still return it
> and get RAM that actually performs to the specs on the box, but that's
> just me! :->
>
> -Richard
>
--
gentoo-user@gentoo.org mailing list