Hi,

On 2012-06-06 20:50:17 +0100, Steven Chamberlain wrote:
> This could be a symptom of bad RAM in the machine.  (Although from the
> system info, it sounds like you might have ECC RAM).

When there are RAM problems, the kernel logs error messages in
/var/log/kern.log, such as "ECC/ChipKill ECC error". But here,
the sysadmin said that there were no such error messages.
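
For reference, such a check of the log could be scripted. The following is
only a minimal sketch: the regular expression is an assumption (the exact
wording of the messages depends on the kernel and the EDAC driver), and the
function names are hypothetical.

```python
import re

# Assumed pattern: the real message text depends on the kernel / EDAC driver.
ECC_PATTERN = re.compile(r"\b(ECC|ChipKill|EDAC)\b", re.IGNORECASE)


def find_ecc_lines(lines):
    """Return the log lines that look like memory error reports."""
    return [line.rstrip("\n") for line in lines if ECC_PATTERN.search(line)]


def scan_log(path="/var/log/kern.log"):
    """Scan a kernel log file; reading it usually requires root or group adm."""
    with open(path, errors="replace") as f:
        return find_ecc_lines(f)
```

An empty list from scan_log() would agree with what the sysadmin reported,
but only for the period covered by the current (non-rotated) log file.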

> Have you considered checking it, e.g. with memtest86+?  If you can't
> take the machine offline, memtester might give you some idea (as root,
> test an amount approx. equal to the amount reported as free+cache).

I'll ask the sysadmin.
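
The amount suggested above (free + cache) can be read from /proc/meminfo.
This is just a sketch, assuming the usual "Name:  value kB" layout of that
file; the function name is hypothetical.

```python
def free_plus_cache_kib(meminfo_text):
    """Return MemFree + Cached (in KiB) from the text of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        tokens = rest.split()
        if tokens:
            # The first token is the number; an optional "kB" unit follows.
            values[name.strip()] = int(tokens[0])
    return values["MemFree"] + values["Cached"]


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        kib = free_plus_cache_kib(f.read())
    print(f"free+cache: about {kib // 1024} MiB")
```

The printed value is only an upper bound for what memtester could be asked
to test; asking for a bit less leaves some room for the running system.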

> Otherwise, you could maybe disable your swap partitions and run
> badblocks on them in case something gets corrupted after being written
> out / read back in from disk?

The machine has 80 GB of RAM (currently only 3 GB are used), so I suppose
that swap is never used. At least, according to atop:

MEM | tot   78.8G | free   51.8G | cache  23.6G | buff  831.9M | slab    1.8G |
SWP | tot    9.8G | free    9.8G |              | vmcom 650.4M | vmlim  49.2G |

MEM | tot   78.8G | free   51.9G | cache  23.7G | buff  832.0M | slab    1.8G |
SWP | tot    9.8G | free    9.8G |              | vmcom 546.3M | vmlim  49.2G |

at 16:02 and 16:12 respectively (the problem occurred at 16:08).
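
The comparison of the two SWP lines can also be done mechanically. Below is
a minimal sketch that assumes the "|"-separated layout shown above; the
function names are hypothetical.

```python
def parse_atop_line(line):
    """Split e.g. 'SWP | tot 9.8G | free 9.8G |' into (label, fields)."""
    parts = [p.strip() for p in line.split("|")]
    fields = {}
    for part in parts[1:]:
        tokens = part.split()
        if len(tokens) == 2:  # skip empty columns
            fields[tokens[0]] = tokens[1]
    return parts[0], fields


def swap_unused(swp_line):
    """True if the SWP line reports as much free swap as total swap."""
    label, fields = parse_atop_line(swp_line)
    if label != "SWP":
        raise ValueError("not an SWP line: %r" % swp_line)
    return fields["tot"] == fields["free"]
```

Note that atop rounds these values to 0.1G, so a few tens of MB of swap
usage would not be visible, and the two samples say nothing certain about
the minutes in between.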

Note: I'm running the same test in a loop on 3 machines (including
this one where the problem occurred).

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


