On 2018-04-23 18:03, Segher Boessenkool via cfarm-users wrote:
> On Mon, Apr 23, 2018 at 08:05:09AM -0700, David Edelsohn via cfarm-users wrote:
>> I have reported the problem to OSU.
>>
>> Don't know if it's limited to a filesystem corruption bug or symptom
>> of a hardware disk failure.
>
> I kicked off all users and tried an xfs_repair.  xfs_repair -n finds
> a lot of errors, but xfs_repair does not want to run because the device
> is busy (although lsof claims it is not).
>
> We probably need a reboot, and yes I fear hardware failure :-(


I suspect an XFS crash rather than a hardware failure (like we had on gcc118). According to LVM, the system uses multipath for storage; multipath looks fine, and so does the root filesystem.

I've forced a reboot and the system came back online with /home mounted. dmesg shows an XFS recovery:
[   12.608593] XFS (dm-6): Mounting V4 Filesystem
[   12.841363] XFS (dm-6): Starting recovery (logdev: internal)
[   29.126997] XFS (dm-6): Ending recovery (logdev: internal)
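
For reference, the timestamps above put the log recovery at roughly 16 seconds. A minimal sketch of how that figure can be pulled out of dmesg output, assuming the line format shown above (the function name and the SAMPLE variable are illustrative, not part of any tool):

```python
import re

# Matches lines such as:
# [   12.841363] XFS (dm-6): Starting recovery (logdev: internal)
LINE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+XFS \(([^)]+)\): (Starting|Ending) recovery"
)

def recovery_seconds(dmesg_text):
    """Return {device: seconds} elapsed between the 'Starting recovery'
    and 'Ending recovery' messages for each XFS device."""
    started = {}
    durations = {}
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        timestamp = float(match.group(1))
        device = match.group(2)
        phase = match.group(3)
        if phase == "Starting":
            started[device] = timestamp
        elif device in started:
            durations[device] = timestamp - started.pop(device)
    return durations

# The three lines quoted in this message.
SAMPLE = """\
[   12.608593] XFS (dm-6): Mounting V4 Filesystem
[   12.841363] XFS (dm-6): Starting recovery (logdev: internal)
[   29.126997] XFS (dm-6): Ending recovery (logdev: internal)
"""

if __name__ == "__main__":
    for device, seconds in recovery_seconds(SAMPLE).items():
        print(f"{device}: recovery took {seconds:.1f} s")
```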

I ran an xfs_repair to be sure; XFS looks OK… for now.

If the error comes back, we may try switching to ext4 (this would mean losing all data).

Aymeric.
_______________________________________________
cfarm-users mailing list
cfarm-users@lists.tetaneutral.net
https://lists.tetaneutral.net/listinfo/cfarm-users
