On 07/11/2012 03:58 PM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Sašo Kiselkov
>>
>> I really mean no disrespect, but this comment is so dumb I could swear
>> my IQ dropped by a few tenths of a point just by reading.
>
> Cool it please. You say "I mean no disrespect" and then say something which
> is clearly disrespectful.
I sort of flew off the handle there, and I shouldn't have. It felt like
Tomas was misrepresenting my position and putting words in my mouth. I
certainly didn't mean to diminish the validity of an honest question.

> Tomas's point is to illustrate that hashing is a many-to-one function. If
> it were possible to rely on the hash to always be unique, then you could use
> it as a compression algorithm. He's pointing out that's insane. His
> comment was not in the slightest bit dumb; if anything, it seems like maybe
> somebody (or some people) didn't get his point.

I understood his point very well, and I never argued that hashing always
produces unique values, which is why I felt he was misrepresenting what
I said. For a full explanation of why hashes aren't usable for
compression:

1) They are one-way (a bit of a bummer for decompression).

2) They operate far below the Shannon limit: a fixed-length digest
   cannot encode the entropy of an arbitrarily long input, so by the
   pigeonhole principle information is necessarily destroyed, making
   lossless compression impossible.

3) Their output is pseudo-random, so even if we enumerated the
   collisions for a given digest, we would have no way to tell which
   preimage was the one actually meant; all of them are equally
   probable. (See the sketch in the P.S. below.)

A formal proof would of course take longer to construct and would take
time that I feel is best spent writing code.

Cheers,
--
Saso
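P.S. To make point 3 concrete, here is a toy Python sketch (purely my
illustration; the names and sizes are arbitrary, and nothing here comes
from ZFS). It truncates SHA-256 to 16 bits so collisions show up
quickly: with 100,000 inputs and only 65,536 possible digests, the
pigeonhole principle guarantees them, and once two inputs share a
digest, the digest itself tells you nothing about which one was
"compressed":

    import hashlib
    from collections import defaultdict

    def tiny_hash(data):
        # Truncate SHA-256 to 16 bits so collisions are easy to find.
        return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

    preimages = defaultdict(list)
    for i in range(100000):
        msg = ("block-%d" % i).encode()
        preimages[tiny_hash(msg)].append(msg)

    collisions = {h: msgs for h, msgs in preimages.items() if len(msgs) > 1}
    print("%d digests have more than one preimage" % len(collisions))
    digest, msgs = next(iter(collisions.items()))
    print("digest 0x%04x <- %s" % (digest, msgs[:3]))
    # Given only the 16-bit digest, every preimage above is an equally
    # valid "decompression"; the digest carries no information about
    # which input was meant.

The same counting argument applies unchanged to a full 256-bit digest;
the collisions are just astronomically harder to exhibit.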