!@**#@**@ - Sorry, I hit send before I typed in my response...

On Mon, May 13, 2013 at 11:43 AM, Bill Bogstad <[email protected]> wrote:
> On Mon, May 13, 2013 at 8:59 AM, Edward Ned Harvey (lopser)
> <[email protected]> wrote:
>>> From: [email protected] [mailto:[email protected]]
>>> On Behalf Of Skylar Thompson
>>>
>>> I think checksumming has a place in backup/archive systems, but I'm not sure
>>> that end-to-end checksumming will allow sufficient scalability, at least with
>>> current filesystem technology. At $WORK, if we had to checksum each file on
>>> each filesystem we backup, I doubt we could complete our backups in our window,
>>
>> Right on. If you have block-level or filesystem-implemented data integrity,
>> then you can rely on the filesystem. But without it, the only way you can
>> check is to run a huge intensive scan. You definitely DON'T want to do that
>> on every send, but you definitely WANT or NEED to do it sometimes.

For my personal files, I repurposed the AIDE intrusion detection system to do
this, and even on 500G it is painful.

>>> What I think /could/ work, though, is if checksumming filesystems like ZFS
>>> could expose the checksum data to user applications (like backup clients),
>>
>> The reason that's not possible is because the ZFS checksums don't relate to
>> the files. They relate to data blocks, which may be file fragments, or
>> contain multiple files, and always include various forms of filesystem
>> metadata. So you'll always have to utilize your ZFS checksums via zfs
>> internal commands. You can scrub your whole pool... There might be a
>> fringe use case where it's useful to just validate the blocks that are
>> related to certain files, without doing the whole pool... But I can't think
>> of such a use case.

If you have files/filesystems in a storage pool with different levels
of importance to you, you might want to scrub/validate some of them
more often than others. If I were going to implement something like that,
I would want the interface to allow specifying individual files or
directories (a directory meaning "recursively check everything below it").
Internally, it should translate between the external files/directories view
and ZFS' block view and validate each block only once per invocation,
perhaps by keeping a bitmap of checked blocks for each run. Not knowing
the internals of ZFS, I don't know how feasible this would be.

Bill Bogstad
_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
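
P.S. A rough sketch of the bookkeeping for the per-file scrub described above,
in Python. This is not based on ZFS internals. `blocks_of` and `verify_block`
are hypothetical callbacks standing in for whatever would map a file to its
block numbers and validate a single block, and `total_blocks` is an assumed
upper bound on block numbers. Only the "check each block once per run" part
is real here.

```python
import os
from typing import Callable, Iterable, Iterator, List, Tuple


def walk_files(paths: Iterable[str]) -> Iterator[str]:
    """Yield each path; a directory means everything below it, recursively."""
    for path in paths:
        if os.path.isdir(path):
            for root, _dirs, files in os.walk(path):
                for name in sorted(files):
                    yield os.path.join(root, name)
        else:
            yield path


def scrub_paths(
    paths: Iterable[str],
    blocks_of: Callable[[str], Iterable[int]],   # hypothetical: file -> block numbers
    verify_block: Callable[[int], bool],         # hypothetical: True if block is good
    total_blocks: int,                           # assumed upper bound on block numbers
) -> Tuple[int, List[int]]:
    """Validate the blocks behind the given paths, each block at most once.

    Returns (number of blocks verified, list of blocks that failed).
    """
    checked = bytearray((total_blocks + 7) // 8)  # bitmap: one bit per block
    verified = 0
    bad: List[int] = []
    for file_path in walk_files(paths):
        for block in blocks_of(file_path):
            byte, bit = divmod(block, 8)
            if checked[byte] & (1 << bit):
                continue  # shared block already validated during this run
            checked[byte] |= 1 << bit
            verified += 1
            if not verify_block(block):
                bad.append(block)
    return verified, bad
```

The bitmap is thrown away at the end of each call, so every invocation starts
fresh. Blocks shared between files (or metadata blocks reached from several
files) are only read once per run.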
