On Tue, Sep 17, 2013 at 07:23:07AM +0000, hru...@gmail.com wrote:
> In the case of rsync the hash is applied to strings of a fixed length.
> In this case the input is finite and we can argue with cardinality.
> Just imagine the set of finite strings mapped to a single element in the
> range. If all these sets have the same number of elements and the range
> has n elements, then the probability of collision is n*(1/n)^2 = 1/n;
> otherwise it is greater (simple school algebra to calculate it). The
> extreme case is that all strings are mapped to the same element.

It doesn't really matter. You can go straight to the limit. If you choose a given collection of data, the chance of any other given collection of data mapping to the exact same hash is 1/2^128, regardless of its size.
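
For what it's worth, the cardinality argument in the quoted text is easy to check numerically. The following is only a sketch under my own assumptions: two inputs are drawn independently and uniformly from a finite input set, and the helper name collision_probability and the bucket sizes are made up for illustration. Nothing here is taken from rsync itself.

```python
from fractions import Fraction


def collision_probability(bucket_sizes):
    """Chance that two independent, uniformly drawn inputs share a hash.

    bucket_sizes[i] is the (hypothetical) number of inputs that map to
    hash value i. The result is the sum over i of (size_i / total)^2.
    """
    total = sum(bucket_sizes)
    return sum(Fraction(size, total) ** 2 for size in bucket_sizes)


n = 4  # toy range size; a 128-bit hash would have n = 2**128

# Equal-sized preimages: n * (1/n)^2 = 1/n.
uniform = collision_probability([8] * n)

# Unequal preimages over the same 32 inputs: strictly worse than 1/n.
skewed = collision_probability([20, 4, 4, 4])

# Extreme case: every input maps to the same hash value.
degenerate = collision_probability([32, 0, 0, 0])

print(uniform, skewed, degenerate)
```

With equal-sized preimages the result is exactly 1/n, which for a 128-bit hash is the 1/2^128 figure above. Any imbalance pushes the probability up, and the degenerate case gives 1.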