On Fri, Jun 25, 2010 at 8:45 AM, <michael.fe...@evonik.com> wrote:
> 4. You underestimate the error introduced by misusing mathematical methods.
>
> As I already said in my first e-mail, SHA-1 was developed
> to detect random and deliberate data manipulation.
> It is a cryptographic hash, so there is a low chance of
> guessing or calculating a derived data sequence
> that generates the same hash value as the original data.
> But that is the only thing it ensures.
> There is no evidence that the hash values are
> equally distributed over the data sets, which is important for
> the use of hashing methods in data retrieval.
> In fact, since it is a cryptographic hash,
> you should not be able to calculate that distribution,
> because doing so would mean you could
> construct sets of data that produce the same hash value.
> So you cannot conclude from the low chance of
> guessing or calculating a derived data sequence that
> there is a low chance of hash collisions in general.
I am in favor of making our software more reliable; I just do not want to see us handicap ourselves by programming against a problem that is unlikely to ever happen. If this is so risky, then why are so many people using Git? Isn't it built entirely on this concept of using SHA-1 hashes to identify content? If you Google this topic you can find plenty of flame wars about it in the Git community, but I also notice blog posts like this one:

http://theblogthatnoonereads.davegrijalva.com/2009/09/25/sha-1-collision-probability/

We are already performance-challenged. Doing extra hash calculations to guard against a problem that is not going to happen does not seem like a sound decision.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/
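[Editor's note: for context on the collision-probability argument above, here is a minimal sketch of the standard birthday-bound estimate. It assumes SHA-1 outputs behave like uniformly random 160-bit values, which is exactly the idealization being debated in this thread; deliberate cryptanalytic attacks are a separate question.]

```python
import math

def collision_probability(num_objects: int, hash_bits: int = 160) -> float:
    """Approximate P(at least one collision) among num_objects hashes,
    assuming each hash is an independent, uniform hash_bits-bit value
    (the birthday-problem approximation)."""
    space = 2 ** hash_bits
    # p ~= 1 - exp(-n(n-1) / (2 * 2^bits)); use expm1 so tiny
    # exponents do not underflow to 0 in floating point.
    exponent = -num_objects * (num_objects - 1) / (2 * space)
    return -math.expm1(exponent)

# Even a trillion hashed objects yield a vanishingly small probability:
p = collision_probability(10 ** 12)
print(p)  # on the order of 1e-25
```

Under that uniformity assumption, an accidental collision is far less likely than, say, undetected disk corruption, which is the point the linked blog post makes.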