What I'm saying is that I am getting conflicting information from your rebuttals here.
I (and others) say there will be collisions that will cause data loss if verify is off. You say a collision would be so rare as to be impossible from your perspective. Tomas says, well then, let's just use the hash value for a 4096X compression. You fluff around his argument, calling him names. I say, well then, compute the hashes for all possible bit patterns and demonstrate there are no dupes. You say it's not possible to do that. I illustrate a way that loss of data could cost you money. You say it's impossible for me to construct a block that has the same hash but different content. Several people have illustrated that 128K to 32 bytes is a huge and lossy compression ratio, yet you still say it's viable to leave verify off.

I say, in fact, that the total number of unique patterns that can exist on any pool is small compared to the total number of possible patterns, illustrating that I understand how the key space for the algorithm is small when looking at a ZFS pool, and thus could have a non-collision opportunity. So I can see what perspective you are drawing your confidence from, but I, and others, are not confident that the risk has zero probability.

I'm pushing you to find a way to demonstrate that there is zero risk because, if you do that, then you've in fact created the ultimate compression factor to date for random bit patterns (but enlarged the set of keys that could collide, because the pool is now virtually larger), and you've also demonstrated that the particular algorithm is very good for dedup. That would indicate to me that you can then take that algorithm and run it inside of ZFS dedup to automatically manage when verify is necessary by detecting when a collision occurs.

I appreciate the push back. I'm trying to drive thinking about this in the direction of what is known and finite, away from what is infinitely complex and thus impossible to explore.
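To put rough numbers on the two perspectives, here is a minimal sketch of the arithmetic. It assumes a 256-bit checksum (32 bytes, which is what makes 128K blocks come out to the 4096X ratio above) and uses the standard birthday-bound approximation; the 1 PiB pool size and the function names are hypothetical, chosen only for illustration. It shows that collisions must exist (pigeonhole) while the estimated chance of hitting one within a single pool is tiny but not zero.

```python
import math

BLOCK_BYTES = 128 * 1024  # 128K blocks, as discussed in this thread
HASH_BYTES = 32           # assumption: 256-bit checksum


def compression_ratio(block_bytes=BLOCK_BYTES, hash_bytes=HASH_BYTES):
    """Ratio you would get by storing only the hash instead of the block."""
    return block_bytes // hash_bytes


def log2_possible_blocks(block_bytes=BLOCK_BYTES):
    """log2 of the number of distinct bit patterns a block can hold."""
    return block_bytes * 8


def collision_probability(num_blocks, hash_bits=HASH_BYTES * 8):
    """Birthday-bound estimate: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes hash outputs are uniformly distributed, which is an
    idealization of any real hash function.
    """
    exponent = -num_blocks * (num_blocks - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(exponent)


if __name__ == "__main__":
    pool_bytes = 2 ** 50  # hypothetical 1 PiB pool of unique blocks
    num_blocks = pool_bytes // BLOCK_BYTES

    print("compression ratio:", compression_ratio())
    print("possible blocks: 2^%d" % log2_possible_blocks())
    print("possible hashes: 2^%d" % (HASH_BYTES * 8))
    print("blocks in pool:", num_blocks)
    print("estimated collision probability:", collision_probability(num_blocks))
```

The pigeonhole part is my side of the argument: far more possible blocks than hashes, so collisions exist. The birthday estimate is Saso's side: the blocks on any real pool are a vanishing fraction of the key space. Neither number is a proof of zero risk.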
Maybe all the work has already been done…

Gregg

On Jul 11, 2012, at 11:02 AM, Sašo Kiselkov wrote:

> On 07/11/2012 05:58 PM, Gregg Wonderly wrote:
>> You're entirely sure that there could never be two different blocks that can
>> hash to the same value and have different content?
>>
>> Wow, can you just send me the cash now and we'll call it even?
>
> You're the one making the positive claim and I'm calling bullshit. So
> the onus is on you to demonstrate the collision (and that you arrived at
> it via your brute force method as described). Until then, my money stays
> safely on my bank account. Put up or shut up, as the old saying goes.
>
> --
> Saso

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss