Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

Richard Elling Tue, 21 Jul 2009 10:23:08 -0700

On Jul 20, 2009, at 12:48 PM, Frank Middleton wrote:

On 07/19/09 06:10 PM, Richard Elling wrote:

Not that bad. Uncommitted ZFS data in memory does not tend to
live that long. Writes are generally out to media in 30 seconds.


Yes, but memory hits are instantaneous. On a reasonably busy
system there may be buffers in queue all the time. You may have
a buffer in memory for 100uS but it only takes 1nS for that buffer
to be clobbered. If that happened to be metadata about to be written
to both sides of a mirror then you are toast.  Good thing this
never happens, right :-)


I never win the lottery either :-)

Beware, if you go down this path of thought for very long, you'll soon
be afraid to get out of bed in the morning... wait... most people actually
die in beds, so perhaps you'll be afraid to go to bed instead :-)


Not at all. As with any rational business, my servers all have ECC,
and getting up and out isn't a problem :-). Maybe I've had too many
disks go bad, so I have ECC, mirrors, and backup to a system with
ECC and mirrors (and copies=2, as well). Maybe I've read too many
of your excellent blogs :-).

Sun doesn't even sell machines without ECC. There's a reason for that.

Yes, but all of the discussions in this thread can be classified as
systems engineering problems, not product design problems.


Not sure I follow. We've had this discussion before. OSOL+ZFS lets
you build enterprise class systems on cheap hardware that has errors.
ZFS gives the illusion of being fragile because it, uniquely, reports
these errors. Running OSOL as a VM in VirtualBox using MSWanything
as a host is a bit like building on sand, but there's nothing in
documentation anywhere to even warn folks that they shouldn't rely
on software to get them out of trouble on cheap hardware. ECC is
just one (but essential) part of that.


It is a systems engineering problem because ZFS is working as designed
and VirtualBox is also working as designed.  If you file a bug against
either, the bug should be closed as "not a defect." That means the
responsibility for making sure that the two interoperate lies at the
systems level -- where systems engineers do their job. For an analogy,
guns don't kill people, bullets kill people. The gun is just a platform
for directing bullets. If you shoot yourself in the foot, then the failure
is not with the gun or bullet, it is one layer above -- in the system. It
hurts when you do that, so don't do that.


On 07/19/09 08:29 PM, David Magda wrote:

It's a nice-to-have, but at some point we're getting into the tinfoil
hat-equivalent of data protection.


But it is going to happen! Sun sells only machines with ECC because
that is the only way to ensure reliability. Someone who spends weeks
building a media server at home isn't going to be happy if they lose
one media file let alone a whole pool. At least they should be warned
that without ECC at some point they will lose files. I'm not convinced
that there is any reasonable scenario for losing an entire pool though,
which was the original complaint in this thread.

Even trusty old SPARCs occasionally hang without a panic (in my
experience especially when a disk is about to go bad). If this
happens, and you have to power cycle because even stop-A doesn't
respond, are you all saying that there is a risk of losing a pool
at that point? Surely the whole point of a journalled file system
is that it is pretty much proof against any catastrophe, even the
one described initially.

There have been a couple of (to me) unconvincing explanations of
how this pool was lost.


It is quite simple -- ZFS sent the flush command and VirtualBox
ignored it. Therefore the bits on the persistent store are inconsistent.
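
As a rough illustration of how an ignored flush can leave the persistent
store inconsistent, here is a toy Python model. It is only a sketch under
stated assumptions: the Disk class, the "root" block and the
crash_mid_commit() helper are hypothetical names invented for this example,
and none of it is ZFS or VirtualBox code. It assumes a copy-on-write commit
that writes data first, issues a flush as a barrier, and only then writes
the root pointer.

```python
import zlib


class Disk:
    """Toy disk with a volatile write cache (hypothetical model)."""

    def __init__(self, honor_flush):
        self.honor_flush = honor_flush
        self.media = {}  # blocks that survive a power loss
        self.cache = {}  # blocks that are lost on a power loss

    def write(self, addr, data):
        self.cache[addr] = data

    def flush(self):
        # A device that honors the flush makes every cached block durable.
        # A device that ignores it reports success and does nothing.
        if self.honor_flush:
            self.drain()

    def drain(self):
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self, survivors=()):
        # Only the cached blocks named in `survivors` happened to reach media.
        for addr in survivors:
            if addr in self.cache:
                self.media[addr] = self.cache[addr]
        self.cache.clear()


def open_pool(disk):
    """Follow the root pointer and verify the checksum of its target."""
    addr, checksum = disk.media["root"]
    data = disk.media.get(addr)
    if data is None or zlib.crc32(data) != checksum:
        raise OSError("root block points at missing or corrupt data")
    return data


def crash_mid_commit(honor_flush):
    disk = Disk(honor_flush)
    # First transaction: assume it reached media completely.
    disk.write("blk1", b"old")
    disk.write("root", ("blk1", zlib.crc32(b"old")))
    disk.drain()
    # Second transaction: data, flush as a barrier, then the new root.
    disk.write("blk2", b"new")
    disk.flush()
    disk.write("root", ("blk2", zlib.crc32(b"new")))
    # Power fails; assume only the root block made it out of the cache.
    disk.power_loss(survivors=["root"])
    try:
        return open_pool(disk)
    except OSError as err:
        return "refused: " + str(err)


print(crash_mid_commit(honor_flush=True))
print(crash_mid_commit(honor_flush=False))
```

With the flush honored, the new data is already on media when the root
lands, so the open succeeds. With the flush ignored, the root can land
ahead of the data it points at, and the checksum test refuses the open
rather than return garbage.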

Surely if there is a mechanism whereby
unflushed i/os can cause fatal metadata corruption, this should
be a high priority bug since this can happen on /any/ hardware; it
is just more likely if the foundations are shaky, so the explanation
must require more than that if it isn't a bug.


It isn't a bug in ZFS or VirtualBox. They work as designed.
As has been mentioned before, many times, the recovery of the
data is now a forensics exercise.  All ZFS knows is that the consistency
is broken and is implementing the policy that consistency is more
important than automated access.
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
