On Tue, Sep 17, 2013 at 01:27:06PM +0000, hru...@gmail.com wrote: > Marc Espie <es...@nerim.net> wrote: > > > On Tue, Sep 17, 2013 at 07:23:07AM +0000, hru...@gmail.com wrote: > > > In the case of rsync the hash is applied to strings of a fixed lenth. > > > In this case the input is finite and we can argue with cardinality. > > > Just imagine the set finite strings mapped to a single element in the > > > range. If all these sets have the same number of elements and the range > > > n elements, then the probability of colition is n*(1/n)^2=1/nr; otherwise > > > it is greater (simple school agebra to calculate it). The extreme case > > > is that all strings are mapped to the same element. > > > > It doesn't really matter. You can go straight to the limit. If you choose > > a given collection of data, the chance of any other collection of data > > mapping to the exact same hash is 1/2^128, irregardless of its size. > > I state you the same question that I stated Raimo Niskanen in my previous > mail. > > I think you misunderstand me: I am not speaking about the size of the > input strings, but about the size of sets of input strings.
At least I can not follow your line of reasoning here. Suppose you are limited to length 1 strings of [a-z], then you have 29 possible strings. Still. If you select one of those strings and calculates its SHA-1 hash value. When you try any of the other 28 strings (or any other string of any lenght) and calculates this other strings SHA-1 hash value, the probability that it has the same hash value is 2 ^ (-160). > > Rodrigo. -- / Raimo Niskanen, Erlang/OTP, Ericsson AB