Hi,

>> kernel: [73788.355981] [Hardware Error]: Machine check events logged
>> kernel: [73914.635576] CPU4: Package temperature above threshold, cpu clock throttled (total events = 5538406)
>> kernel: [73914.635581] CPU0: Package temperature above threshold, cpu clock throttled (total events = 5538398)
>
> Since your CPU had thermal protection, it's supposed to take effect before 
> the hardware is permanently damaged, but the thermal stress might have 
> affected it, or other components like memory or the PSU.

>> [29016.445470] clamd[1110] general protection ip:30df2c3981 sp:7fffa08f4fe0 error:0 in libclamav.so.6.1.11[30df200000+9ce000]
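
In case it helps narrow things down: as far as I understand, the
'general protection' line gives the faulting instruction pointer (ip)
and the library mapping as [base+size], so the offset into libclamav
can be worked out from it. This is only a rough sketch, and
fault_offset is a name I made up for illustration:

```python
def fault_offset(ip_hex, base_hex, size_hex):
    """Return the offset of the faulting ip inside the mapped library."""
    ip, base, size = (int(x, 16) for x in (ip_hex, base_hex, size_hex))
    offset = ip - base
    # Sanity check: the ip should fall inside [base, base + size).
    if not 0 <= offset < size:
        raise ValueError("ip is outside the mapping")
    return offset


# Values copied from the log line quoted above.
print(hex(fault_offset("30df2c3981", "30df200000", "9ce000")))  # 0xc3981
```

My assumption is that repeated crashes at the same offset would point
more towards software, while offsets scattered all over the library
would look more like hardware, but I only have this one line to go on.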

I've now moved the hard disks to the old server (also x86_64), and it
has been running fine with no 'general protection' errors for more
than twelve hours. Is it safe to assume that no software bug is
causing these errors?

I've also been stress-testing the new hardware separately. It
completed two full passes of memtest86 without any errors, and it has
now been running mprime for more than twelve hours without a failure.

When these 'general protection' errors were produced, the system was
typically under high load and high IO.

I realize this may be a hardware issue, but does anyone have any ideas
on how to determine what is really going on?

Is there a way to stress-test clamav on the new hardware, to try and
induce an error through high IO?
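
The kind of thing I have in mind is sketched below: run many scans in
parallel and count how each one ends. This is an untested idea, not a
known-good method. The function names are made up, and I'm assuming
that clamscan (which loads libclamav itself) exercises the same code
as clamd, which may not hold:

```python
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor


def describe(returncode):
    """Translate a subprocess return code into a short label."""
    if returncode < 0:
        # On POSIX, a negative code means the process died from a signal.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exit %d" % returncode


def run_once(cmd):
    """Run one command and return its return code, discarding output."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode


def stress(cmd, workers=8, rounds=100):
    """Run cmd workers*rounds times, workers at a time; count outcomes."""
    counts = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for rc in pool.map(run_once, [cmd] * (workers * rounds)):
            label = describe(rc)
            counts[label] = counts.get(label, 0) + 1
    return counts
```

For example, stress(["clamscan", "-r", "--no-summary", "/path/to/testdata"])
where /path/to/testdata is a placeholder for a large directory of
files. clamscan exits with 0 (clean) or 1 (virus found) on a normal
run, so an "exit 2" or a "killed by SIGSEGV" in the counts would be
the thing to look for.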

Thanks,
Alex
_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml
