On Sat, Oct 27, 2012 at 12:35 PM, Jim Klimov <jimkli...@cos.ru> wrote:
> 2012-10-27 20:54, Toby Thain wrote:
>
>>> Parity is very simple to calculate and doesn't use a lot of CPU - just
>>> slightly more work than reading all the blocks: read all the stripe
>>> blocks on all the drives involved in a stripe, then do a simple XOR
>>> operation across all the data. The actual checksums are more expensive
>>> as they're MD5 - much nicer when these can be hardware accelerated.
>>>
>>
>> Checksums are MD5??
>>
>
> No, they are fletcher variants or sha256, with more probably coming
> up soon, and some of these might also be boosted by certain hardware
> capabilities, but I tend to agree that parity calculations likely
> are faster (even if not all parities are simple XORs - that would
> be silly for double- or triple-parity sets which may use different
> algos just to be sure).
>

I would expect raidz2 and raidz3 to use the same math as traditional RAID 6 for parity: https://en.wikipedia.org/wiki/Raid6#RAID_6 . In particular, note the sentence "For a computer scientist, a good way to think about this is that <operator> is a bitwise XOR operator and <g superscript i> is the action of a linear feedback shift register on a chunk of data."

If I understood it correctly, it runs a different number of LFSR iterations on each sector, depending on that sector's position among the data sectors. The LFSR is applied independently to small groups of bytes in each sector, and the results are then XORed together to get the second parity sector. For third parity, I believe it needs to use a different generator polynomial for the LFSR.

For small numbers of iterations, multiple iterations of the LFSR can be optimized to a single shift and an XOR with a lookup value on the lowest bits. For larger numbers of iterations (if you have, say, 28 disks in a raidz3), it could construct the 25th iteration by doing 10, 10, 5, but I have no idea how ZFS actually implements it.
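To make the above concrete, here is a rough Python sketch of the P and Q parity math as I read it from the Wikipedia article. It is not taken from the ZFS source. The polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1), the generator g = 2, and the function names lfsr_step, mul_g_pow and parity are my own assumptions for illustration, and it works one byte at a time rather than on whatever word size a real implementation would use.

```python
# Sketch of RAID-6 style P/Q parity over GF(2^8).
# Assumptions (not from ZFS): polynomial 0x11D, generator g = 2.
POLY = 0x11D


def lfsr_step(b: int) -> int:
    """One LFSR iteration: multiply a byte by g = 2 in GF(2^8)."""
    b <<= 1
    if b & 0x100:
        # The shift overflowed 8 bits, so reduce by the polynomial.
        b ^= POLY
    return b


def mul_g_pow(b: int, i: int) -> int:
    """Multiply a byte by g**i by running the LFSR i times (unoptimized)."""
    for _ in range(i):
        b = lfsr_step(b)
    return b


def parity(sectors: list[bytes]) -> tuple[bytes, bytes]:
    """Return (P, Q) for equally sized data sectors.

    P is the plain XOR of all sectors. Q XORs together each sector
    after i LFSR iterations, where i is the sector's index.
    """
    size = len(sectors[0])
    p = bytearray(size)
    q = bytearray(size)
    for i, sector in enumerate(sectors):
        for k, byte in enumerate(sector):
            p[k] ^= byte
            q[k] ^= mul_g_pow(byte, i)
    return bytes(p), bytes(q)


if __name__ == "__main__":
    p, q = parity([b"\x01\x10", b"\x01\x20", b"\x01\x30"])
    print(p.hex(), q.hex())
```

Running the LFSR 10, 10 and then 5 times gives the same byte as running it 25 times in one go, which is the composition I described above. A real implementation would presumably replace the inner loop with lookup tables or wider shifts.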
As I understand it, fletcher checksums are extremely simple: basically two additions and two modulo operations per unit of input, however many bytes it processes at a time. So I wouldn't be surprised if fletcher was about the same speed as computing second/third parity. I don't know about SHA256, but I would expect it to be more expensive, simply because it is a cryptographic hash.

Tim
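P.S. For reference, this is the textbook Fletcher-16 in Python, which is what I had in mind with "two additions and two modulo operations". It is only a sketch of the general idea: the function name fletcher16 is mine, and the fletcher variants ZFS uses work on wider words and may differ in the details.

```python
def fletcher16(data: bytes) -> int:
    """Textbook Fletcher-16: two running sums, each reduced modulo 255."""
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255  # simple sum of the input bytes
        sum2 = (sum2 + sum1) % 255  # sum of the running sums (position sensitive)
    return (sum2 << 8) | sum1


if __name__ == "__main__":
    print(hex(fletcher16(b"abcde")))
```

The second sum is what makes the checksum sensitive to byte order, which a plain additive checksum is not.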
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss