Data security. I migrated my organization from Linux to Solaris, driven away from Linux by the shortfalls of fsck on TB-size file systems, and towards Solaris by the features of ZFS.
At the time I tried to dig up information on the tradeoffs between Fletcher2, Fletcher4, and SHA256 and found nothing. Studying the algorithms, I decided that Fletcher2 would tend to be weak for periodic data, which characterizes my data. I ran throughput tests and got 67MB/sec for Fletcher2 and Fletcher4 and 48MB/sec for SHA256. I assumed (perhaps without basis) that SHA256's cryptographic strength also means strength as a hash, and chose it, since 48MB/sec is more than I need.

21 months later (9/15/09) I lost everything to a "corrupt metadata" error (I am not sure where this was printed), ZFS-8000-CS. I have no clue why to date, and I will never know. The person who restored from tape was not told to set checksum=sha256, so everything went in with the default, Fletcher2.

Before taking rather disruptive actions to correct this, I decided to question my original decision and found schlie's post stating that a bug in Fletcher2 makes it essentially a one-bit parity on the entire block: http://opensolaris.org/jive/thread.jspa?threadID=69655&tstart=30

While this is twice as good as any other file system in the world that has NO such checksum, it does not provide the security I migrated for. Especially given that I do not know what caused the original data loss, the checksum is all I have to lean on. I am convinced that I need to convert all of the checksums to SHA256 to have the data security ZFS purports to deliver, and in the absence of a checksum conversion capability, that means I need to copy the data.

It appears that every means of copying data, from tar and cpio to cp to rsync to pax, has skeletons in its closet. Each lives in a glass house and throws stones at the others over various issues with file size, filename lengths, pathname lengths, ACLs, extended attributes, sparse files, and so on.
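For reference, the Fletcher2 concern raised earlier can be made concrete with a small experiment. The sketch below is an assumption on my part, not something taken from schlie's post: it models Fletcher2 as two interleaved streams of 64-bit words summed into 64-bit accumulators that wrap on overflow, which is my understanding of the ZFS loop. The function names and the test block are hypothetical. Under that model, flipping the top bit of two suitably placed words leaves all four checksum words unchanged, while the same pair of flips in the lowest bit is caught.

```python
MASK64 = (1 << 64) - 1  # accumulators wrap at 64 bits


def fletcher2(words):
    """Fletcher2-style checksum over an even-length list of 64-bit words.

    Assumed model: even-indexed words feed (a0, b0), odd-indexed words
    feed (a1, b1), and every addition is taken modulo 2**64.
    """
    a0 = a1 = b0 = b1 = 0
    for i in range(0, len(words), 2):
        a0 = (a0 + words[i]) & MASK64
        a1 = (a1 + words[i + 1]) & MASK64
        b0 = (b0 + a0) & MASK64
        b1 = (b1 + a1) & MASK64
    return (a0, a1, b0, b1)


def flip_bit(words, index, bit):
    """Return a copy of words with one bit flipped in words[index]."""
    out = list(words)
    out[index] ^= 1 << bit
    return out


# Hypothetical 16-word block with arbitrary contents.
block = [(i * 0x9E3779B97F4A7C15 + 12345) & MASK64 for i in range(16)]
good = fletcher2(block)

# A single top-bit flip changes a0 by 2**63, so it is detected.
one_flip = flip_bit(block, 0, 63)

# Two top-bit flips in the same stream (words 0 and 4): a0 changes by
# 2**64 and b0 by an even multiple of 2**63, and both vanish modulo 2**64.
two_flips = flip_bit(flip_bit(block, 0, 63), 4, 63)

# The same two positions with the lowest bit flipped instead.
low_flips = flip_bit(flip_bit(block, 0, 0), 4, 0)

print("one top-bit flip detected: ", fletcher2(one_flip) != good)
print("two top-bit flips detected:", fletcher2(two_flips) != good)
print("two low-bit flips detected:", fletcher2(low_flips) != good)
```

This only shows that the high bits get weaker protection than the low bits in this model. It says nothing about how likely such error patterns are on real hardware.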
It seems like zfs send/receive *should* be safe from all such issues as part of the zfs family, but what it actually does becomes ambiguous once one starts to think about it. If the file system is faithfully duplicated, it should also duplicate all properties, including the checksum used on each block. It appears (to my advantage) that this is not what is done. This lets the filesystem created by zfs receive inherit its checksum setting from the pool, which evidently can be set to sha256 even though it is a pool and not a file system in the pool.

The present question is protection on the base pool. This can be set when the pool is created, though not with U4, which I am running. It is not clear (yet) whether this is simply undocumented in the current release or whether the version that supports it has not been released yet. If I were to upgrade (which I cannot do in a timely fashion), it would only be to U7. I cannot run a "weekly build" type of OS on my production server. Either way, I am hosed.

Surely there is some structure, some blocks with data written in them, when a pool is created but before anything else is done; otherwise it would be a blank disk, not a zfs pool. Are these "protected" by Fletcher2 as the default? I have learned that the uberblock is protected by SHA256, and other parts by Fletcher4. Is this everything? In U4 was it Fletcher4, or was this a recent change stemming from schlie's report?

In short, what is the situation with regard to the data security I switched to Solaris/ZFS for, and what can I do to achieve it? What *do* the tools do? Are there tools for what needs to be done to convert things, to copy things, to verify things, and to do so completely and correctly?

So here is where I am: I should zfs send/receive, but I cannot be confident that there are no Fletcher2-protected blocks (one-bit parity) at the most fundamental levels of the zpool. To verify the data, I cannot depend on existing tools, since diff is not large-file aware.
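Since diff cannot be relied on for large files here, one option is to compare per-file digests that are computed in fixed-size chunks, so that file size never matters. The following is a minimal sketch of that idea, not a vetted tool: the function names, the 1 MiB chunk size, and the sha256 default are my own choices, and any algorithm name that hashlib accepts (such as md5) can be passed instead.

```python
import hashlib
import os

CHUNK = 1 << 20  # read 1 MiB at a time so memory use is independent of file size


def file_digest(path, algo="sha256"):
    """Return the hex digest of one file, streamed in CHUNK-sized reads."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root, algo="sha256"):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only file contents are compared.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digests[os.path.relpath(full, root)] = file_digest(full, algo)
    return digests


def compare_trees(src, dst, algo="sha256"):
    """Return (missing_in_dst, extra_in_dst, content_differs) as sorted lists."""
    a = tree_digests(src, algo)
    b = tree_digests(dst, algo)
    missing = sorted(set(a) - set(b))
    extra = sorted(set(b) - set(a))
    changed = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing, extra, changed
```

Calling `compare_trees("/old/fs", "/new/fs")` (both paths are placeholders) returns three lists that should all be empty for a faithful copy. This checks file contents only. Ownership, permissions, ACLs, extended attributes, and sparseness would still need to be checked separately.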
My best idea at this point is to calculate and compare MD5 sums of every file and spot-check other properties as best I can.

Given this rather full account, any help or comments would be much appreciated. I still think zfs is the way to go, but the road is a little bumpy at the moment.
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss