I have a 7.1 system that gets a lot of disk I/O activity.  It's an
outgoing mail server that handles all of our hosted mailing lists.  Two
of its file systems seem to be suffering from severe corruption after
several days of uptime.  For the /var/log file system I can stop all
services, unmount it, run e2fsck on it, and remount it, but the main
file system (/) is the one I'm having trouble with.  Lately it has been
getting more and more corruption: duplicate blocks, lost inodes, and
so on.
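
    For reference, the /var/log procedure amounts to roughly the
following.  This is only a sketch, assuming Python is available on the
box; the function names are made up for illustration, and the dry-run
default uses e2fsck's read-only -n switch so nothing gets modified
until I ask for it.

```python
import subprocess


def fsck_commands(device, mount_point, dry_run=True):
    """Build the unmount / check / remount sequence for one filesystem.

    fsck_commands is a hypothetical helper name, not part of any tool.
    """
    # -f forces a check even if the filesystem is marked clean.
    # -n opens it read-only and answers "no" to every question;
    # -p ("preen") repairs only the problems that are safe to fix.
    check = ["e2fsck", "-f", "-n" if dry_run else "-p", device]
    return [
        ["umount", mount_point],
        check,
        ["mount", device, mount_point],
    ]


def run_all(commands):
    """Run each command in order; stop early if unmounting fails."""
    for cmd in commands:
        # e2fsck exits non-zero when it finds or fixes errors, so the
        # exit status is reported rather than treated as fatal.
        result = subprocess.run(cmd)
        print(" ".join(cmd), "-> exit status", result.returncode)
        if cmd[0] == "umount" and result.returncode != 0:
            break  # never check a filesystem that is still mounted


if __name__ == "__main__":
    # Only print the plan here; call run_all() after stopping services.
    for cmd in fsck_commands("/dev/hdc1", "/var/log"):
        print(" ".join(cmd))
```

    The same sequence can't be used for / since it can't be unmounted
while the system is running, which is exactly where I'm stuck.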

    There are three drives in the system, and usually the one mounted as
/var/log is the first one to start showing problems (/dev/hdc1), and
generally / follows (/dev/hda1) and eventually my /usr/local/src will
join the party (/dev/hdb1).

    Could this be an indication that the onboard controller is starting
to fail?  Especially since it's now also starting on a third file system
(these are all different drives, 3 in total)?

    Another application that tells me every day that something changed
is tripwire.  Sometimes it reports entire directories that have
apparently disappeared; however, they're still there and all data is
accounted for.  The next day it'll come back with no errors.  Sometimes
it'll report binaries that have changed, but verifying those against a
clean (CD) version shows they're still the same.
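
    The check against the CD copy is essentially a checksum
comparison, something like the sketch below.  The function names are
illustrative only, and SHA-256 is my own choice here rather than
whatever tripwire uses internally.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large binaries never have to fit
        # in memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(installed_path, reference_path):
    """True when the installed file matches the clean reference copy."""
    return sha256_of(installed_path) == sha256_of(reference_path)
```

    Hashing the same installed file twice in a row and comparing the
two digests might also be worth a try: if they ever differ, the reads
themselves are inconsistent, which would point at hardware rather than
at the files actually changing.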

    I'm lost.  Should I start looking for my sledgehammer and deem this
box a lost cause?

--
W | I haven't lost my mind; it's backed up on tape somewhere.
  +--------------------------------------------------------------------
  Ashley M. Kirchner <mailto:[EMAIL PROTECTED]>   .   303.442.6410 x130
  IT Director / SysAdmin / WebSmith             .     800.441.3873 x130
  Photo Craft Laboratories, Inc.            .     3550 Arapahoe Ave. #6
  http://www.pcraft.com ..... .  .    .       Boulder, CO 80303, U.S.A.




_______________________________________________
Redhat-list mailing list
[EMAIL PROTECTED]
https://listman.redhat.com/mailman/listinfo/redhat-list
