Melameth, Daniel D. wrote:
> The bottom of your dmesg appears to indicate your HD is dying--act fast.

Naw.  That's a UDMA error.  That's also a chip which I think is prone to
doing EXACTLY that...downgrading from UDMA2 to UDMA1.  I have a laptop
which does the exact same thing at boot, with the same chip (I think).

(though if it prompts you to make a backup, ignore what I just said. :)


> Simon Morgan wrote:
>> This morning my server started rebooting itself constantly for about
>> 15 minutes. Although the last log seems to indicate that at least 1
>> crash occurred, no core dumps are to be found in /var/crash and I
>> couldn't find anything pertinent in the system logs. I'm thinking
>> flaky hardware but, short of plugging in a monitor and sitting
>> watching it, is there anything I can do to trace the source of the
>> problem? I ran memtest86+ on the machine not too long ago and I'll do
>> so again in the meantime.
...

I'd be looking for a HW problem, possibly external to the computer.
I've seen bad UPSs do that so many times that I quit recommending them
to my clients -- they were less reliable than local power (though that
seems to be changing lately, and unfortunately not because UPSs have
become more reliable, just because the power has become less reliable).
Give the average UPS a dead battery or a dead inverter, and it will turn
a tiny, insignificant power glitch into a two-second power outage and
boom, your computer reboots.  A cheap digital clock plugged into the
same outlet as your computer is a great diagnostic tool (make sure it
resets after a very brief power glitch; some of them take many seconds
to reset).

BTW: As I've probably said here before, memtest86 is a good program, but
be aware of what any diagnostic can do: if it tells you that you have a
problem, you probably have a problem.  If it says "no problem found", it
may have just been unable to find one.  I've actually never had
memtest86 tell me something I didn't already know, though I had one case
where putting some incompatible RAM in a machine would keep it from
booting... BUT memtest86 could be (and was!) run for days on it and
NEVER find a problem.  There are a lot of things other than the memory
that can cause problems (power supply leaps to mind).

>> pciide0 at pci0 dev 15 function 0 "Acer Labs M5229 UDMA IDE" rev 0xc1: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
>> wd0 at pciide0 channel 0 drive 0: <Maxtor 92041U4>
>> wd0: 16-sector PIO, LBA, 19541MB, 40020624 sectors
>> wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
>> atapiscsi0 at pciide0 channel 1 drive 0
>> scsibus0 at atapiscsi0: 2 targets
>> cd0 at scsibus0 targ 0 lun 0: <MITSUMI, CR-4802TE, 2.1D> SCSI0 5/cdrom removable
>> cd0(pciide0:1:0): using PIO mode 3, DMA mode 1
...
>> wd0: soft error (corrected)
>> wd0: transfer error, downgrading to Ultra-DMA mode 1
>> wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 1
>> wd0a: DMA error reading fsbn 41824 of 41824-41855 (wd0 bn 41887; cn 41 tn 8 sn 55), retrying
>> wd0: soft error (corrected)

Yeah, that's a not-abnormal UDMA "problem" on that chip (Acer M5229), I
do believe...
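
If you want to see whether those messages stay a one-off at boot or keep
piling up, here is a rough Python sketch that tallies them from saved
dmesg text.  The two patterns are copied from the lines quoted above;
the function name summarize() and the SAMPLE text are just made up for
illustration, and other drivers or releases may word these messages
differently.

```python
import re

# Patterns taken from the wd0 messages quoted in this thread.
# Other devices or kernel versions may phrase these differently.
DOWNGRADE = re.compile(r"^(\w+): transfer error, downgrading to (.+)$")
SOFT_ERROR = re.compile(r"^(\w+): soft error \(corrected\)")


def summarize(dmesg_text):
    """Count corrected soft errors and list mode downgrades per device."""
    summary = {}
    for line in dmesg_text.splitlines():
        line = line.strip()
        m = DOWNGRADE.match(line)
        if m:
            dev = summary.setdefault(
                m.group(1), {"soft_errors": 0, "downgrades": []}
            )
            dev["downgrades"].append(m.group(2))
            continue
        m = SOFT_ERROR.match(line)
        if m:
            dev = summary.setdefault(
                m.group(1), {"soft_errors": 0, "downgrades": []}
            )
            dev["soft_errors"] += 1
    return summary


# Hypothetical sample, built from the lines quoted above.
SAMPLE = """\
wd0: soft error (corrected)
wd0: transfer error, downgrading to Ultra-DMA mode 1
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 1
wd0: soft error (corrected)
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

One downgrade at boot and a couple of corrected soft errors is what I'd
expect from that chip; a count that keeps climbing during normal use
would make me look harder at the drive and cable.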

Nick.
