Darren J Moffat <darr...@opensolaris.org> writes:

> Kjetil Torgrim Homme wrote:
>> Andrey Kuzmin <andrey.v.kuz...@gmail.com> writes:
>>
>>> Downside you have described happens only when the same checksum is
>>> used for data protection and duplicate detection. This implies sha256,
>>> BTW, since fletcher-based dedupe has been dropped in recent builds.
>>
>> if the hash used for dedup is completely separate from the hash used
>> for data protection, I don't see any downsides to computing the dedup
>> hash from uncompressed data. why isn't it?
>
> It isn't separate because that isn't how Jeff and Bill designed it.

thanks for confirming that, Darren.

> I think the design they have is great.

I don't disagree.

> Instead of trying to pick holes in the theory can you demonstrate a
> real performance problem with compression=on and dedup=on and show
> that it is because of the compression step ?

compression requires CPU, actually quite a lot of it. even with the
lean and mean lzjb, you won't get much more than 150 MB/s per core or
something like that. so, if you're copying a 10 GB image file, it will
take a minute or two just to compress the data so that the hash can be
computed so that the duplicate block can be identified. if the dedup
hash was based on uncompressed data, the copy would be limited by
hashing efficiency (and dedup tree lookup).

I don't know how tightly interwoven the dedup hash tree and the block
pointer hash tree are, or if it is at all possible to disentangle them.
conceptually it doesn't seem impossible, but that's easy for me to say,
with no knowledge of the zio pipeline...

oh, how does encryption play into this? just don't? knowing that
someone else has the same block as you is leaking information, but that
may be acceptable -- just make different pools for people you don't
trust.

> Otherwise if you want it changed code it up and show how what you have
> done is better in all cases.

I wish I could :-)

-- 
Kjetil T. Homme
Redpill Linpro AS - Changing the game

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
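
A rough sketch of the argument in the reply, for illustration only; this is not ZFS code and says nothing about how the zio pipeline is actually structured. Assumptions: zlib stands in for lzjb (which is not in the Python standard library), a plain dict stands in for the dedup table, all function names are hypothetical, and 150 MB/s per core is the ballpark figure from the post rather than a measurement.

```python
import hashlib
import zlib


def compress_seconds(size_bytes, throughput_bytes_per_s):
    """Time spent compressing if every block must go through the compressor."""
    return size_bytes / throughput_bytes_per_s


def write_hash_after_compress(block, table):
    """Dedup key derived from the compressed block.

    The block is always compressed, even when it turns out to be a duplicate.
    """
    compressed = zlib.compress(block)
    key = hashlib.sha256(compressed).digest()
    is_dup = key in table
    if not is_dup:
        table[key] = compressed
    return key, is_dup


def write_hash_before_compress(block, table):
    """Dedup key derived from the uncompressed block.

    A duplicate is detected before any compression work is done.
    """
    key = hashlib.sha256(block).digest()
    if key in table:
        return key, True          # duplicate: compression skipped entirely
    table[key] = zlib.compress(block)
    return key, False


if __name__ == "__main__":
    # 10 GB at 150 MB/s (decimal units) is roughly 67 seconds of compression.
    print(round(compress_seconds(10 * 10**9, 150 * 10**6)), "seconds")
```

In the second variant a copy of already-stored data costs one hash and one table lookup per block, which is the "limited by hashing efficiency (and dedup tree lookup)" case described in the reply.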