!@**#@**@  - Sorry, I hit send before I typed in my response...

On Mon, May 13, 2013 at 11:43 AM, Bill Bogstad <[email protected]> wrote:
> On Mon, May 13, 2013 at 8:59 AM, Edward Ned Harvey (lopser)
> <[email protected]> wrote:
>>> From: [email protected] [mailto:[email protected]]
>>> On Behalf Of Skylar Thompson
>>>
>>> I think checksumming has a place in backup/archive systems, but I'm not sure
>>> that end-to-end checksumming will allow sufficient scalability, at least
>>> with current filesystem technology. At $WORK, if we had to checksum each
>>> file on each filesystem we backup, I doubt we could complete our backups in
>>> our window,
>>
>> Right on.  If you have block-level or filesystem-implemented data integrity, 
>> then you can rely on the filesystem.  But without it, the only way you can 
>> check is to run a huge intensive scan.  You definitely DON'T want to do that 
>> on every send, but you definitely WANT or NEED to do it sometimes.

For my personal files, I repurposed the AIDE intrusion detection
system to do this, and even on 500 GB it is painful.
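
A minimal sketch of that kind of file-level check, in Python.  This is
not how AIDE itself is implemented; the function names (sha256_of,
build_manifest, changed_files) are made up for illustration.  It reads
every byte of every file, which is exactly why it hurts on large trees.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash one file in fixed-size chunks so large files never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (relative path) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def changed_files(old, new):
    """Return sorted paths whose digest differs or that exist in only one manifest."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Build a manifest once, store it somewhere safe, and compare a fresh
one against it later.  A changed digest only says the content differs;
it cannot tell a legitimate edit from corruption.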

>>> What I think /could/ work, though, is if checksumming filesystems like ZFS
>>> could expose the checksum data to user applications (like backup clients),
>>
>> The reason that's not possible is because the ZFS checksums don't relate to 
>> the files.  They relate to data blocks, which may be file fragments, or 
>> contain multiple files, and always include various forms of filesystem 
>> metadata.  So you'll always have to utilize your ZFS checksums via zfs 
>> internal commands.  You can scrub your whole pool...  There might be a 
>> fringe use case where it's useful to just validate the blocks that are 
>> related to certain files, without doing the whole pool...  But I can't think 
>> of such a use case.

If you have files/filesystems in a storage pool with different levels
of importance to you, you might want to scrub/validate some of them
more often than others.  If I were going to implement something like
that, I would want the interface to allow specifying individual files
or directories (which would mean recursively checking everything below
them).  Internally, it should translate between the external
files/directories view and ZFS's block view and validate each block
only once per invocation, perhaps by keeping a bitmap of checked
blocks for each run.  Not knowing the internals of ZFS, I don't know
how feasible this would be.
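
To make the idea concrete, here is a rough Python sketch of the
bookkeeping only.  Everything ZFS-specific is an assumption: blocks_of
(path to block numbers) and verify_block (checksum test for one block)
are hypothetical callbacks, and blocks are modeled as small integers,
which is a simplification rather than a description of ZFS.

```python
import os


def expand(paths):
    """Yield each file; a directory means everything below it, recursively."""
    for path in paths:
        if os.path.isdir(path):
            for dirpath, _dirs, files in os.walk(path):
                for name in files:
                    yield os.path.join(dirpath, name)
        else:
            yield path


def scrub_paths(paths, blocks_of, verify_block, total_blocks):
    """Verify every block behind the given files/directories once per run.

    blocks_of(path) -> iterable of int block numbers (hypothetical mapping).
    verify_block(block) -> True if the block's checksum matches (hypothetical).
    Returns the sorted list of block numbers that failed verification.
    """
    checked = bytearray((total_blocks + 7) // 8)  # one bit per block, all clear
    bad = []
    for path in expand(paths):
        for block in blocks_of(path):
            byte, bit = divmod(block, 8)
            if checked[byte] & (1 << bit):
                continue  # already validated during this invocation
            checked[byte] |= 1 << bit
            if not verify_block(block):
                bad.append(block)
    return sorted(bad)
```

The bitmap is what keeps a block shared by several files from being
read twice in one run; it is thrown away when the run ends, so the
next invocation starts from scratch.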

Bill Bogstad
_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
