Hello,
Not sure if it's worth troubleshooting this too much before upgrading, but
we recently had an 8.1R/amd64 box hang in a way that suggested everything
was waiting on disk access. The box is remote and we had to resort to a
power-cycle to bring it back (we have a serial console, but the login hung
after accepting the root password).
We run hourly/daily/weekly/monthly snapshots on about a half dozen
filesystems using RSE's snapshot script
(see http://people.freebsd.org/~rse/snapshot/ - we only use the ZFS
snapshotting and do not use the amd portion). We have some basic stats
logged on all our boxes every 5 minutes and I saw a pile of cron jobs
stuck in disk I/O wait. I suspect these were the snapshots. Shortly
after that it seems as if all disk I/O got hung.
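To make the "stuck in disk I/O wait" observation concrete, here is a rough sketch of how such a check could be scripted. This is not our actual logging script: the function names are made up for illustration, and it assumes ps is invoked as "ps -axo pid,state,wchan,command" so that a state beginning with D marks a process in uninterruptible (disk) wait.

```python
import subprocess


def disk_wait_processes(ps_output):
    """Return (pid, wchan, command) for every process whose state starts with 'D'.

    Expects the text produced by `ps -axo pid,state,wchan,command`
    (an assumed invocation; adjust the column list to taste).
    """
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # pid, state, wchan, rest of line
        if len(fields) < 4:
            continue  # malformed or truncated line
        pid, state, wchan, command = fields
        if state.startswith("D"):
            stuck.append((int(pid), wchan, command))
    return stuck


def snapshot_ps():
    """Run ps once and return its raw output (hypothetical helper)."""
    result = subprocess.run(
        ["ps", "-axo", "pid,state,wchan,command"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for pid, wchan, command in disk_wait_processes(snapshot_ps()):
        print(f"{pid}\t{wchan}\t{command}")
```

Run from cron alongside the other status logging, this would record which wait channel each stuck process was sleeping on, which is probably more useful than just knowing the count.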
Some additional info about what the main tasks are on this box:
-qmail deliveries (lots)
-postgres (light use)
-nfs export of qmail log dirs to another box that does log analysis
All services are spread amongst a handful of jails. Each jail has its
own ZFS filesystem.
Does this sound familiar to anyone running ZFS with snapshots? Anything I
should log to get more data if this happens again? I have output from
arc_summary.pl running every 5 minutes as part of our general status
logging.
Any pointers to known issues in ZFS (both 8.1 and 8.2) would be helpful.
Also, anywhere to look for the general state of ZFS besides this page?
http://wiki.freebsd.org/ZFS
Thanks,
Charles
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"