-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello everybody,

last week we experienced a severe outage due to a crashed zpool. I'm now
investigating the reason for the crash in order to prevent it in the
future. Maybe some of the people with more experience are able to help
me.

The setup:

- - Sun Fire X4270 with 16GB RAM running OpenSolaris 2009.06, acting as
Samba PDC and NIS/NFS server for some 400 users.

- - sas_pool built of 24x 300GB SAS disks (4x raidz) in JBOD, 2x 32GB SSD
(mirror) for zfs log, 1x 160GB SSD for zfs cache

- - bulk_pool containing 42x 1TB SATA/SAS disks in 2 JBODs


The machine worked for several months without a problem. A week ago we
added the last set of 6 disks to the sas_pool.


What happened:

The server became unavailable; evidently it had crashed and written a
kernel core dump.

After rebooting the machine, the server crashed again (core dumping)
while trying to mount the zfs filesystems (home directories) from the
sas_pool.

We booted into single-user mode and checked the zpool status. The
sas_pool was degraded, with a failed SSD in the log mirror. We replaced
the failed disk and waited until the resilvering process had finished
(it took some 4 hours). zpool status for the pool was fine after that.
Rebooting the machine into multi-user mode resulted in the same core
dump as before.

Fortunately we had an rsync mirror of our home directories (a second
4270 with a bunch of SATA JBODs). We finally mounted the spare machine
via NFS in place of the crashed pool to keep services running.


What might be the reason?

- - the failed SSD (should not do harm, as the log is mirrored)
- - not enough RAM causing the crash, damaging the zpool


Is there any chance to revive the crashed pool? Otherwise we need to
rebuild the pool from scratch and rsync from the fallback (this will
take several days).




Thanks in advance for any suggestions



Carsten





- --
Max Planck Institut fuer marine Mikrobiologie
- - Network Administration -
Celsiustr. 1
D-28359 Bremen
Tel.: +49 421 2028568
Fax.: +49 421 2028565
PGP public key:http://www.mpi-bremen.de/Carsten_John.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFLi3sosRCwZeehufsRAqXmAKDg2KoR1exq4jTMkiR8iBt+xsDW1QCgjsrO
mK4uYJec0A3oO1kQCyM9XFQ=
=icmr
-----END PGP SIGNATURE-----
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
