Richard Elling <[EMAIL PROTECTED]>

Cromar Scott wrote:
> Chris Siebenmann <[EMAIL PROTECTED]>
>
> I'm not Anton Rang, but:
> | How would you describe the difference between the data recovery
> | utility and ZFS's normal data recovery process?
>
> cks> The data recovery utility should not panic
> cks> my entire system if it runs into some situation
> cks> that it utterly cannot handle. Solaris 10 U5
> cks> kernel ZFS code does not have this property;
> cks> it is possible to wind up with ZFS pools that
> cks> will panic your system when you try to touch them.
> ...
>
> I'll go you one worse. Imagine a Sun Cluster with several resource
> groups and several zpools. You blow a proc on one of the servers. As a
> result, the metadata on one of the pools becomes corrupted.
>
re> This failure mode affects all shared-storage
re> clusters. I don't see how ZFS should or should
re> not be any different than raw, UFS, et al.

Absolutely true. The file system definitely had a problem.

> http://mail.opensolaris.org/pipermail/zfs-discuss/2008-April/046951.html
>
> Now, each of the servers in your cluster attempts to import the
> zpool--and panics.
>
> As a result of a single part failure on a single server, your entire
> cluster (and all the services on it) are sitting in a smoking heap on
> your machine room floor.
>

re> Yes, but your data is corrupted.

My data was only corrupted on ONE of the zpools. In a cluster with several zpools and several resource groups, we ended up with ALL of the pools and ALL of the resource groups offline as one node after another panicked.

re> If you were my bank, then I would greatly
re> appreciate you getting the data corrected
re> prior to bringing my account online.

Fair enough, but do we have to take Fred's and Joe's accounts offline too?

re> If you study highly available clusters and services
re> then you will see many cases where human interaction
re> is preferred to automation for just such cases.

I see your point about requiring intervention to deal with a potentially corrupt file system. I would have preferred behavior more like what we get with VxVM and VxFS, where the corrupted file system fails to mount without human intervention, but the nodes don't panic on the failed vxdg import. That particular service group and that particular file system go offline, but everything else keeps running because none of the other nodes panics.

I understand that panicking the original node keeps the file system from being corrupted any further, but I don't understand why each successive node in the cluster also needs to panic. Why can't we just refuse to import the pool automatically? (I sketch the sort of thing I mean at the end of this message.)

> I'm just glad that our pool corruption experience happened during
> testing, and not after the system had gone into production. Not exactly
> a resume-enhancing experience.

re> I'm glad you found this in testing.

I'm a believer. Some people wanted us to just throw the box into production, but I insisted on keeping our test schedule. I'm glad I did.

re> BTW, what was the root cause?

It appears that the metadata on that pool became corrupted when the processor failed. The exact mechanism is a bit of a mystery, since we never got a valid crash dump. The other pools were fine once we imported them after a boot -x.

We ended up converting that server to VxVM and VxFS because we could not guarantee that the same thing wouldn't happen again after we went into production. If we had had a tool that let us roll back to a previous snapshot or something similar, it might have made a difference.

We were told that the probability of metadata corruption would have been reduced, but not eliminated, by having a mirrored LUN. We were also told that the issue will be fixed in U6.

--Scott
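
P.S. For what it's worth, here is roughly the import behavior I would have
wanted from the cluster start method. This is only a sketch, not something
I have tested: the pool name "tank" is made up, and it assumes the failmode
pool property in the newer ZFS bits both exists on the release in question
and actually covers import-time corruption, which I cannot verify.

    # Ask ZFS not to panic the node on a catastrophic pool error.
    # The property accepts wait | continue | panic; "panic" is the
    # behavior we experienced, "continue" is what I would want here.
    zpool set failmode=continue tank

    # In the resource start method, attempt the import; if it fails,
    # fail this one resource group instead of taking down the node.
    # (A real agent would also have to decide when a forced import
    # with -f is safe; that is deliberately left out of this sketch.)
    if ! zpool import tank; then
        echo "zpool import of tank failed; leaving resource offline" >&2
        exit 1
    fi

    # Quick health check before letting services start on the pool.
    zpool status -x tank

The point is simply that a failed import should offline one service group,
the way a failed vxdg import does, rather than panicking every node that
tries it in turn.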