Re: [clamav-users] clamd exits with libclamav error

Török Edwin Wed, 19 Oct 2011 12:00:13 -0700

On 2011-10-19 21:53, Alex wrote:
> Hi,
> 
>>> kernel: [73788.355981] [Hardware Error]: Machine check events logged
>>> kernel: [73914.635576] CPU4: Package temperature above threshold, cpu clock throttled (total events = 5538406)
>>> kernel: [73914.635581] CPU0: Package temperature above threshold, cpu clock throttled (total events = 5538398)
>>
>> Since your CPU had thermal protection, it's supposed to take effect before 
>> the hardware is permanently damaged, but the thermal stress might have 
>> affected it, or other components like memory or the PSU.
> 
>>> [29016.445470] clamd[1110] general protection ip:30df2c3981 sp:7fffa08f4fe0 error:0 in libclamav.so.6.1.11[30df200000+9ce000]
> 
> I've now switched the hard disks to the old server (also an x86_64
> arch) and it has been running fine with no 'general protection' errors
> for more than twelve hours. I think it's safe to assume there is no
> software bug causing these errors?
> 
> I've also been stress testing the new hardware separately. It
> succeeded through two full passes of memtest86 without any errors.
> It's now been running mprime for more than twelve hours and has not
> failed.
> 
> When these 'general protection' errors were produced, the system was
> typically under high load and high IO.
> 
> I realize this may be a hardware issue, but does anyone have any ideas
> how to determine what is really going on?


There are some packages for stress-testing, like cpuburn.
cpuburn in MMX mode is quite good at raising your CPU temperature, so I
suggest you keep an eye on the CPU sensors (sensors -l) if you do run it.
Try running one cpuburn on each CPU core for a while.
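
A minimal sketch of the "one cpuburn per core" idea, assuming the MMX
burner is installed as burnMMX (the name some distribution packages use;
adjust BURN_CMD if yours differs) and that nproc is available. The helper
name burn_all_cores and the BURN_CMD variable are made up for this
example and are not part of cpuburn.

```shell
#!/bin/sh
# Hypothetical helper: start one burner per CPU core, keep the load
# running for a fixed time, then stop every instance again.
# ASSUMPTION: the MMX burner binary is called burnMMX on this system.
BURN_CMD="${BURN_CMD:-burnMMX}"

burn_all_cores() {
    duration="${1:-600}"      # seconds to keep the load running
    cores="$(nproc)"          # number of online CPU cores
    pids=""
    i=0
    while [ "$i" -lt "$cores" ]; do
        "$BURN_CMD" &         # one burner per core, in the background
        pids="$pids $!"
        i=$((i + 1))
    done
    echo "started $cores instances of $BURN_CMD"
    sleep "$duration"
    # Stop the burners; ignore errors for instances that already exited.
    kill $pids 2>/dev/null || true
    wait
}
```

Calling "burn_all_cores 300" would load every core for five minutes.
Watch the temperatures from a second terminal while it runs (for example
"watch -n 2 sensors") and stop the test if they keep climbing.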

Of course, it's also possible that your hardware was fine before and that
you'll damage it by running the stress tests (if you have inadequate
cooling, for example), so you do so at your own risk!

> 
> Is there a way to stress-test clamav on the new hardware, to try and
> induce an error through high IO?

For high I/O try this: run updatedb to update your locate database,
and at the same time launch a clamd multiscan:
clamdscan -m /
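
As a rough sketch, the two commands can be started together from one
script so that they really do overlap. The helper name io_stress and the
IO_CMD and SCAN_CMD variables are illustrative only; the defaults are the
two commands mentioned above, and both normally need to run as root.

```shell
#!/bin/sh
# Sketch: run the locate-database update and a clamd multiscan at the
# same time, then report how each one exited.
# IO_CMD and SCAN_CMD are hypothetical knobs so the helper can be
# dry-run with harmless stand-in commands.
IO_CMD="${IO_CMD:-updatedb}"
SCAN_CMD="${SCAN_CMD:-clamdscan -m /}"

io_stress() {
    sh -c "$IO_CMD" &
    io_pid=$!
    sh -c "$SCAN_CMD" &
    scan_pid=$!
    io_rc=0
    scan_rc=0
    # wait returns the exit status of the given background job.
    wait "$io_pid" || io_rc=$?
    wait "$scan_pid" || scan_rc=$?
    echo "io exit=$io_rc scan exit=$scan_rc"
}
```

After a run, checking whether clamd is still alive and searching the
kernel log (for example "dmesg | grep 'general protection'") shows
whether the load reproduced the crash.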

Another test that you can do is to compile some large pieces of software
(Linux kernel, OpenOffice, etc.) with make -j N, where N = nr_cores * 2.
GCC uses a _lot_ of pointer manipulation and will randomly crash on
faulty hardware, although in that case memtest usually detects the
errors too.
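
The N = nr_cores * 2 rule can be computed rather than typed by hand. A
small sketch, assuming nproc is available; the function names
jobs_for_cores and parallel_build are invented for this example.

```shell
#!/bin/sh
# Sketch: derive the make job count from the core count (N = cores * 2).
jobs_for_cores() {
    echo $(( $1 * 2 ))
}

# Run make in the current source tree with the computed job count.
# Any extra arguments are passed through to make unchanged.
parallel_build() {
    n="$(jobs_for_cores "$(nproc)")"
    echo "running: make -j $n $*"
    make -j "$n" "$@"
}
```

Run parallel_build inside an already configured source tree and repeat
the build a few times from a clean state. Compiler crashes that show up
in a different place on each run point towards hardware, whereas a
failure that always hits the same file is more likely a real build
problem.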

Best regards,
--Edwin
_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://www.clamav.net/support/ml
