On Fri, Jun 25, 2010 at 8:45 AM, <michael.fe...@evonik.com> wrote:
> 4. You underestimate the error introduced by misusing mathematical methods.
>
> As I already said in my first e-mail, SHA-1 was developed
> to detect random and deliberate data manipulation.
> It is a cryptographic hash, so there is a low chance of
> guessing or calculating a derived data sequence
> that generates the same hash value as the original data.
> But that is the only thing it ensures.
> There is no evidence that the hash values are
> equally distributed over the data sets, which is important for
> the use of hashing methods in data retrieval.
> In fact, since it is a cryptographic hash,
> you should not be able to calculate that distribution,
> because doing so would mean you could
> construct sets of data that produce the same hash value.
> So you cannot conclude from the low chance of
> guessing or calculating a derived data sequence that
> there is a low chance of hash collisions in general.
I am in favor of making our software more reliable; I just do not want to see us handicap ourselves by programming against a problem that is unlikely to ever happen. If this is so risky, then why are so many people using Git? Isn't it built entirely on this concept of using SHA-1 hashes to identify content? If you Google this topic you can find plenty of flame wars about it in the Git community, but I also notice blog posts like this one:

http://theblogthatnoonereads.davegrijalva.com/2009/09/25/sha-1-collision-probability/

We are already performance-challenged. Doing extra hash calculations to guard against a problem that is not going to happen does not seem like a sound decision.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/
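[Editor's note: for context on the collision-probability argument above, here is a minimal sketch of the standard birthday-bound estimate. It assumes SHA-1 outputs behave like uniformly random 160-bit values, which is exactly the idealization being debated in this thread; deliberate cryptanalytic attacks are a separate question.]

```python
import math

def collision_probability(num_objects: int, hash_bits: int = 160) -> float:
    """Approximate P(at least one collision) among num_objects hashes,
    assuming each hash is an independent, uniform hash_bits-bit value
    (the birthday-problem approximation)."""
    space = 2 ** hash_bits
    # p ~= 1 - exp(-n(n-1) / (2 * 2^bits)); use expm1 so tiny
    # exponents do not underflow to 0 in floating point.
    exponent = -num_objects * (num_objects - 1) / (2 * space)
    return -math.expm1(exponent)

# Even a trillion hashed objects yield a vanishingly small probability:
p = collision_probability(10 ** 12)
print(p)  # on the order of 1e-25
```

Under that uniformity assumption, an accidental collision is far less likely than, say, undetected disk corruption, which is the point the linked blog post makes.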