Re: [zfs-discuss] ZFS tale of woe and fail

2009-08-21 Thread Ross
It was blogged about on Joyent, Tim: http://www.joyent.com/joyeurblog/2008/01/16/strongspace-and-bingodisk-update/ and the bug ID is http://bugs.opensolaris.org/view_bug.do?bug_id=6458218

Re: [zfs-discuss] ZFS tale of woe and fail

2009-08-20 Thread Tom Bird
Ross wrote:
> Yup, that one was down to a known (and fixed) bug though, so it isn't the
> normal story of ZFS problems.
Got a bug ID or anything for that, just out of interest? As an update on my storage situation, I've got some JBODs now, see how that goes. -- Tom // www.portfast.co.uk

Re: [zfs-discuss] ZFS tale of woe and fail

2009-08-14 Thread Ross
Yup, that one was down to a known (and fixed) bug though, so it isn't the normal story of ZFS problems.

Re: [zfs-discuss] ZFS tale of woe and fail

2009-08-14 Thread David Magda
On Fri, August 14, 2009 09:02, Tom Bird wrote:
> I can't remember how many errors the check found, however all the data
> copied off successfully, as far as we know.
I would think that you'd be fairly confident of the integrity of the data, since everything would be checksummed. Joyent also had a
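For anyone wanting to check that end to end, a minimal sketch, assuming the pool is importable and is the 'content' pool from earlier in the thread:

    # every block is verified against its checksum on read; a scrub walks
    # the whole pool and reports (but, on a non-redundant pool, cannot
    # repair) any blocks that fail verification
    zpool scrub content
    zpool status -v content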

Re: [zfs-discuss] ZFS tale of woe and fail

2009-08-14 Thread Tom Bird
Victor Latushkin wrote:
> This issue (and previous one reported by Tom) has got some publicity recently - see here:
> http://www.uknof.org.uk/uknof13/Bird-Redux.pdf
So I feel like I need to provide a little bit more information about the outcome (sorry that it is delayed and not as full as previo

Re: [zfs-discuss] ZFS tale of woe and fail

2009-07-11 Thread roland
Mhh, I think I'm afraid too, as I also need to use ZFS on a single, large LUN.

Re: [zfs-discuss] ZFS tale of woe and fail

2009-07-02 Thread Ross
It is a ZFS issue. My understanding is that ZFS has multiple copies of the uberblock, but only tries to use the most recent one on import, meaning that on rare occasions it's possible to lose access to the pool even though the vast majority of your data is fine. I believe there is work going
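That fall-back-to-an-older-uberblock idea is roughly what later shipped as the 'zpool import -F' recovery option; a sketch only, with a hypothetical device name, assuming a pool called 'content' that refuses to import:

    # list pools the system can see but has not imported
    zpool import

    # dump the on-disk vdev labels for one of the pool's devices
    # (the uberblock array lives in the same label area)
    zdb -l /dev/dsk/c0t0d0s0

    # on builds that have -F, ask import to discard the last few
    # transactions and fall back to an older, consistent uberblock
    zpool import -F content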

Re: [zfs-discuss] ZFS tale of woe and fail

2009-07-01 Thread David Magda
On Jul 1, 2009, at 12:37, Victor Latushkin wrote:
> This issue (and previous one reported by Tom) has got some publicity recently - see here:
> http://www.uknof.org.uk/uknof13/Bird-Redux.pdf
Joyent also had issues a while back as well: http://tinyurl.com/ytyzs6 http://www.joyeur.com/2008/01/22/

Re: [zfs-discuss] ZFS tale of woe and fail

2009-07-01 Thread Victor Latushkin
On 19.01.09 12:09, Tom Bird wrote:
> Toby Thain wrote:
>> On 18-Jan-09, at 6:12 PM, Nathan Kroenert wrote:
>>> Hey, Tom -
>>> Correct me if I'm wrong here, but it seems you are not allowing ZFS any
>>> sort of redundancy to manage.
> Every other file system out there runs fine on a single LUN, when things go

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-19 Thread Miles Nordin
> "b" == Blake writes: b> removing the zfs cache file located at /etc/zfs/zpool.cache b> might be an emergency workaround? just the opposite. There seem to be fewer checks blocking the autoimport of pools listed in zpool.cache than on 'zpool import' manual imports. I'd expect th

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-19 Thread Blake
Miles, that's correct - I got muddled in the details of the thread. I'm not necessarily suggesting this, but is this an occasion when removing the zfs cache file located at /etc/zfs/zpool.cache might be an emergency workaround? Tom, please don't try this until someone more expert replies to my qu

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-19 Thread Miles Nordin
> "nk" == Nathan Kroenert writes: > "b" == Blake writes: nk> I'm not sure how you can class it a ZFS fail when the Disk nk> subsystem has failed... The disk subsystem did not fail and lose all its contents. It just rebooted a few times. b> You can get a sort of redundanc

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-19 Thread Blake
You can get a sort of redundancy by creating multiple filesystems with 'copies' enabled on the ones that need some sort of self-healing in case of bad blocks. Is it possible to at least present your disks as several LUNs? If you must have an abstraction layer between ZFS and the block device, pre
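A minimal sketch of the 'copies' suggestion, assuming a single-LUN pool named 'content' and a hypothetical dataset for the data that matters most (copies=2 guards against bad blocks, not against losing the whole LUN):

    # store two copies of every block written to this dataset
    zfs create -o copies=2 content/important

    # or enable it on an existing dataset (affects newly written data only)
    zfs set copies=2 content/important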

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-19 Thread Tom Bird
Toby Thain wrote:
> On 18-Jan-09, at 6:12 PM, Nathan Kroenert wrote:
>> Hey, Tom -
>>
>> Correct me if I'm wrong here, but it seems you are not allowing ZFS any
>> sort of redundancy to manage.
Every other file system out there runs fine on a single LUN; when things go wrong you have a fsck uti

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-18 Thread Toby Thain
On 18-Jan-09, at 6:12 PM, Nathan Kroenert wrote:
> Hey, Tom -
>
> Correct me if I'm wrong here, but it seems you are not allowing ZFS any
> sort of redundancy to manage.
Which is particularly catastrophic when one's 'content' is organized as a monolithic file, as it is here - unless, of co

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-18 Thread Nathan Kroenert
Hey, Tom -
Correct me if I'm wrong here, but it seems you are not allowing ZFS any sort of redundancy to manage. I'm not sure how you can class it a ZFS fail when the disk subsystem has failed... Or - did I miss something? :) Nathan.
Tom Bird wrote:
> Morning,
>
> For those of you who remem
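For context, the distinction being drawn here; a sketch with hypothetical device names, not the poster's actual layout:

    # single-LUN pool: ZFS can detect corruption via checksums, but has
    # nothing to repair user data from (only metadata gets ditto copies)
    zpool create content c2t0d0

    # pool with redundancy ZFS manages itself: checksum errors can be
    # self-healed from the other side of the mirror (or raidz parity)
    zpool create content mirror c2t0d0 c3t0d0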

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-18 Thread Tom Bird
Tim wrote:
> On Sun, Jan 18, 2009 at 8:02 AM, Tom Bird wrote:
> errors: Permanent errors have been detected in the following files:
>
>   content:<0x0>
>   content:<0x2c898>
>
> r...@cs4:~# find /content
> /content
> r...@cs4:~# (ye

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-18 Thread Joerg Schilling
Tim wrote:
> On Sun, Jan 18, 2009 at 8:02 AM, Tom Bird wrote:
> Those are supposedly the two inodes that are corrupt. The 0x0 is a bit
> scary... you should be able to find out what file(s) they're tied to (if
> any) with:
>
> find /content -inum 0
> find /content -inum 182424
Using find to s
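For what it's worth, the <0x...> entries in 'zpool status -v' are ZFS object numbers in the named dataset (0x2c898 is 182424 decimal, hence the -inum above), so zdb can also be pointed at the object directly rather than scanning 63T with find; a sketch, assuming the files live in the pool's root filesystem 'content':

    # dump the object's metadata at high verbosity, including its path
    # if it can still be resolved
    zdb -dddd content 182424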

Re: [zfs-discuss] ZFS tale of woe and fail

2009-01-18 Thread Tim
On Sun, Jan 18, 2009 at 8:02 AM, Tom Bird wrote:
> Morning,
>
> For those of you who remember last time, this is a different Solaris,
> different disk box and different host, but the epic nature of the fail
> is similar.
>
> The RAID box that is the 63T LUN has a hardware fault and has been
> cra