Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 01 May 2019, at 12:37, Karl Denninger <k...@denninger.net> wrote:
> 
> On 4/30/2019 20:59, Michelle Sullivan wrote:
>>> On 01 May 2019, at 11:33, Karl Denninger <k...@denninger.net> wrote:
>>> 
>>>> On 4/30/2019 19:14, Michelle Sullivan wrote:
>>>> 
>>>> Michelle Sullivan
>>>> http://www.mhix.org/
>>>> Sent from my iPad
>>>> 
>>> Nope.  I'd much rather *know* the data is corrupt and be forced to
>>> restore from backups than to have SILENT corruption occur and perhaps
>>> screw me 10 years down the road when the odds are my backups have
>>> long-since been recycled.
>> Ah yes, the be-all and end-all of ZFS: it stops the silent corruption of 
>> data. But don’t install it on anything unless it’s server grade with 
>> backups and ECC RAM... yet it’s good on laptops because it protects you 
>> from silent corruption of your data when, 10 years later, the backups have 
>> long since been recycled... Um, is that not a circular argument?
>> 
>> Don’t get me wrong here... I know you (and some others) are running ZFS in 
>> the DC with tens of thousands in redundant servers and/or backups to keep 
>> your critical data corruption-free = a good thing.
>> 
>> “ZFS on everything” is what some say (because it prevents silent 
>> corruption), but then you have default policies that install it 
>> everywhere, including on hardware not equipped to function safely with it 
>> (by your own arguments), and yet it’s still good because it will still 
>> prevent silent corruption, even though it relies on hardware that you can 
>> trust... Um, say what?
>> 
>> Anyhow, we’ve veered way, way off (the original) topic...
>> 
>> A modest (part consumer-grade, part commercial) system suffered 
>> irreversible data loss because of a (very unusual, but not impossible) 
>> double power outage... and there are no tools to recover the data (or part 
>> of it) unless you have some form of backup, because the file system deems 
>> the corruption too dangerous to let you access any of it (even the 
>> known-good bits)...
>> 
>> Michelle
> 
> IMHO you're dead wrong Michelle.  I respect your opinion but disagree
> vehemently.

I guess we’ll have to agree to disagree then, but I think your readiness to 
pronounce me “dead wrong” is short-sighted, because it smacks of “I’m right 
because ZFS is the answer to all problems.” I’ve been around the industry 
long enough to see a variety of issues... some disasters, some not so...

I also should know better than to run without backups, but financial 
constraints precluded them... as they will for many non-commercial people.

> 
> I run ZFS on both of my laptops under FreeBSD.  Both have
> non-power-protected SSDs in them.  Neither is mirrored or Raidz-anything.
> 
> So why run ZFS instead of UFS?
> 
> Because a scrub will detect data corruption that UFS cannot detect *at all.*

I get it, I really do, but that has to be balanced against this: if you 
can’t rebuild it, make sure you have (tested and working) backups and be 
prepared for downtime when such corruption does occur.
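(For reference, the detection being debated here is cheap to invoke; the 
pool name `zroot` below is just an example, substitute your own:)

```shell
# Re-read every allocated block in the pool and verify each one
# against its stored checksum. Runs in the background.
zpool scrub zroot

# Report the outcome; with -v any files found to be corrupt are
# listed by name so you know exactly what to restore.
zpool status -v zroot
```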

> 
> It is a balance-of-harms test and you choose.  I can make a very clean
> argument that *greater information always wins*; that is, I prefer in
> every case to *know* I'm screwed rather than not.  I can defend against
> being screwed with some amount of diligence but in order for that
> diligence to be reasonable I have to know about the screwing in a
> reasonable amount of time after it happens.

Not disagreeing (and have not been.)

> 
> You may have never had silent corruption bite you.

I have... but not with data on disks. Most of my silent corruption issues 
have been a layer or two above the hardware... like Subversion commits 
overwriting previous commits without notification (damn, I wish I could 
reliably replicate it!)


>   I have had it happen
> several times over my IT career.  If that happens to you the odds are
> that it's absolutely unrecoverable and whatever gets corrupted is
> *gone.*

Every drive corruption I have suffered in my career I have been able to 
recover from, all or partial data, except where the hardware itself was 
totally hosed (i.e. clean-room options only)... even with btrfs... yuk... 
oh, what a mess that was... I still get nightmares about that one... but I 
still managed to get most of the data off. In fact, I put it onto this 
machine I currently have problems with... so after the nightmare of btrfs, 
it looks like ZFS eventually nailed me.


>   The defensive measures against silent corruption require
> retention of backup data *literally forever* for the entire useful life
> of the information because from the point of corruption forward *the
> backups are typically going to be complete and correct copies of the
> corrupt data and thus equally worthless to what's on the disk itself.* 
> With non-ZFS filesystems quite a lot of thought and care has to go into
> defending against that, and said defense usually requires the active
> cooperation of whatever software wrote said file in the first place

Say what?  

> (e.g. a database, etc.)

So DBs (any?) talk actively to the file systems (any?) to actively prevent 
silent corruption?

Lol...

I’m guessing you are actually talking about internal checks and balances on 
data in the DB to ensure that data retrieved from disk is not 
corrupt/altered... you know, like publishing sha256 checksums of files you 
might download from the internet, to ensure you got what you asked for and 
it wasn’t changed/altered in transit.
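(That analogy is basically right: record a checksum at write time, verify it 
at read time, which is what ZFS does per block. The userland version is a 
few lines of shell; file paths are placeholders, and this uses GNU 
coreutils’ sha256sum, the FreeBSD equivalent being sha256(1):)

```shell
# Record a checksum at write time.
printf 'important data' > /tmp/demo.dat
sha256sum /tmp/demo.dat > /tmp/demo.dat.sha256

# Later, verify before trusting the contents.
sha256sum -c /tmp/demo.dat.sha256

# Simulate silent corruption: flip one byte in place. No size
# change, no I/O error -- exactly the failure mode under discussion.
printf 'X' | dd of=/tmp/demo.dat bs=1 seek=3 count=1 conv=notrunc 2>/dev/null

# Verification now fails, so the corruption is no longer silent.
sha256sum -c /tmp/demo.dat.sha256 || echo "corruption detected"
```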

>   If said software has no tools to "walk" said
> data or if it's impractical to have it do so you're at severe risk of
> being hosed.

Um, what? I’m talking about a userland (libzfs) tool (i.e. one that doesn’t 
need the pool imported), unlike zfs send (which requires the pool to be 
imported, hence me not calling it a userland tool), that allows sending 
whatever data can be found to other places, where it can either be blindly 
recovered (corruption might be present) or used to locate files/paths etc. 
that are known to be good (checksums match, etc.)... walk the structures, 
feed the data elsewhere where it can be examined/recovered, and don’t alter 
it... It’s a last-resort tool for when you don’t have working backups.
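(For what it’s worth, part of that tool already exists: zdb(8) runs in 
userland and, with -e, can examine a pool that is exported rather than 
imported. It is read-only and won’t reassemble files for you, but it can 
walk the structures. The pool name and block address below are purely 
illustrative:)

```shell
# List datasets and objects without importing the pool (read-only).
zdb -e -d tank

# Traverse the pool and verify checksums of metadata blocks;
# doubling the flag (-cc) extends the check to data blocks too.
zdb -e -c tank

# Dump one raw block by vdev:offset:size for manual salvage work.
zdb -e -R tank 0:400000:20000
```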

>   Prior to ZFS there really wasn't any comprehensive defense
> against this sort of event.  There are a whole host of applications that
> manipulate data that are absolutely reliant on that sort of thing not
> happening (e.g. anything using a btree data structure) and recovery if
> it *does* happen is a five-alarm nightmare if it's possible at all.  In
> the worst-case scenario you don't detect the corruption and the data
> that has the pointer to it that gets corrupted is overwritten and 
> destroyed.
> 
> A ZFS scrub on a volume that has no redundancy cannot *fix* that
> corruption but it can and will detect it.

So you’re advocating restoring from backup for every corruption... OK...


>   This puts a boundary on the
> backups that I must keep in order to *not* have that happen.  This is of
> very high value to me and is why, even on systems without ECC memory and
> without redundant disks, provided there is enough RAM to make it
> reasonable (e.g. not on embedded systems I do development on, which are
> severely RAM-constrained) I run ZFS.
> 
> BTW if you've never had a UFS volume unlink all the blocks within a file
> on an fsck and then recover them back into the free list after a crash
> you're a rare bird indeed.  If you think a corrupt ZFS volume is fun try
> to get your data back from said file after that happens.

Been there, done that, though with ext2 rather than UFS... still got all my 
data back, even though it was a nightmare...


> 
> -- 
> Karl Denninger
> k...@denninger.net <mailto:k...@denninger.net>
> /The Market Ticker/
> /[S/MIME encrypted email preferred]/
_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
