On Thu, Dec 22, 2011 at 11:25 AM, Tim Cook <t...@cook.ms> wrote:
> On Thu, Dec 22, 2011 at 10:00 AM, Myers Carpenter <my...@maski.org> wrote:
>> So the lesson here: Don't be a dumbass like me. Set up nagios or some
>> other system to alert you when a pool has become degraded. ZFS works very
>> well with one drive out of the array, you aren't probably going to notice
>> problems unless you are proactively looking for them.
>
> Or, if you aren't scrubbing on a regular basis, just change your zpool
> failmode property. Had you set it to wait or panic, it would've been very
> clear, very quickly that something was wrong.
>
> http://prefetch.net/blog/index.php/2008/03/01/configuring-zfs-to-gracefully-deal-with-failures/

I'm not sure this will help, as a single failed drive in a raidz1 (or two
in a raidz2) will make the zpool DEGRADED and not FAULTED. I believe this
parameter governs behavior for a FAULTED zpool.

We have a very simple shell script that runs hourly and does a
`zpool status -x` and generates an email to the admins if any pool is in
any state other than ONLINE. As soon as a zpool goes DEGRADED we get
notified and can initiate the correct response (opening a case with Oracle
to replace the failed drive is the usual one).

Here is the snippet from the script of the actual health check (not my
code, I would have done it differently, but this works):

    ...
    not_ok=`${zfs_path}/zpool status -x | egrep -v "all pools are healthy|no pools available"`
    if [ "X${not_ok}" != "X" ]
    then
        fault_details="There is at least one zpool error."
        let fault_count=fault_count+1
        new_faults[${fault_count}]=${fault_details}
    fi

-- 
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
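
For anyone wanting a starting point, the health check in the snippet could be wrapped into a standalone hourly script along the following lines. This is only a minimal sketch, not the script described in the post: the `ZPOOL` and `ADMIN_EMAIL` variables, the function names, the `--run` flag, and the use of `mailx` for delivery are all assumptions made for illustration. Only `zpool status -x` and the two "nothing wrong" strings come from the snippet itself.

```shell
#!/bin/sh
# Sketch of an hourly zpool health check.
# ZPOOL and ADMIN_EMAIL are hypothetical names, not from the original script.
ZPOOL="${ZPOOL:-zpool}"
ADMIN_EMAIL="${ADMIN_EMAIL:-root}"

# Read `zpool status -x` output on stdin and print only the lines that
# indicate a problem, i.e. anything besides the two "nothing wrong" messages.
# `|| true` keeps the function's exit status at 0 when nothing is printed.
filter_unhealthy() {
    grep -E -v 'all pools are healthy|no pools available' || true
}

# Return 0 and print the details if any pool needs attention, 1 otherwise.
# stderr is included so that a failing zpool command is reported as well.
check_pools() {
    not_ok=$("$ZPOOL" status -x 2>&1 | filter_unhealthy)
    if [ -n "$not_ok" ]; then
        printf '%s\n' "$not_ok"
        return 0
    fi
    return 1
}

# Only check and send mail when asked to, so the functions above can be
# sourced and exercised without a real pool or mail setup.
if [ "${1:-}" = "--run" ]; then
    if details=$(check_pools); then
        printf '%s\n' "$details" |
            mailx -s "zpool problem on $(hostname)" "$ADMIN_EMAIL"
    fi
fi
```

Invoked from cron once an hour with the `--run` argument, this stays silent while every pool is healthy and mails the raw `zpool status -x` output otherwise, so a DEGRADED pool is reported as well as a FAULTED one.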