Sorry, this is all bunk.  You shouldn't be worried about
an accidental collision.  You should be worried about
an intentional collision.  Especially if your filesystem
stores data that is under the attacker's control, such as
email messages, web page caches, etc.  So what you need
to analyze isn't how often an accidental collision happens
but how hard it is to create an intentional collision.
All the popular hash algorithms have been losing ground to
attackers lately.

can you make this a little more concrete?  i'm having trouble
understanding how an email that an attacker controls is
a problem.  assuming the attacker can predict the added
headers well enough, this implies that the attacker, given access to
your venti, can retrieve an email said attacker sent.  where's
the problem?  i don't see it yet.

OK, let's assume that the attacker has the most powerful attack
against a hash available in which he can construct a garbage
block of data (perhaps with some control of its content) that
hashes to a value of his choosing.  Now he predicts some data
that is likely to be written to your filesystem soon (say a
brand new pull update that you haven't pulled yet), makes
an email that has a data block in it that collides with that
block, sends that email to you.  Your filesystem stores it.
Later you do a pull and venti notices that you don't have to
store one of the blocks because it already has a block stored
with that same hash.  Now one of your files is corrupt.
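
To make the scenario above concrete, here is a minimal sketch in Python.
It is a toy model, not Venti's actual code: ToyStore, weak_score and
forge_collision are hypothetical names, and the "hash" is SHA-1
deliberately truncated to 2 bytes so that a brute-force search can stand
in for the strong attack assumed above.  The only behavior borrowed from
the discussion is that a write is skipped when a block with the same
score is already stored.

```python
import hashlib
from itertools import count


def weak_score(block: bytes) -> bytes:
    """Toy stand-in for a broken hash: SHA-1 truncated to 2 bytes."""
    return hashlib.sha1(block).digest()[:2]


class ToyStore:
    """Content-addressed store that deduplicates purely by score."""

    def __init__(self):
        self.blocks = {}

    def write(self, block: bytes) -> bytes:
        score = weak_score(block)
        # Assumed dedup rule: if the score is already present,
        # keep the old block and silently drop the new one.
        if score not in self.blocks:
            self.blocks[score] = block
        return score

    def read(self, score: bytes) -> bytes:
        return self.blocks[score]


def forge_collision(target: bytes, prefix: bytes) -> bytes:
    """Brute-force a different block with the same weak score as target."""
    want = weak_score(target)
    for n in count():
        candidate = prefix + str(n).encode()
        if candidate != target and weak_score(candidate) == want:
            return candidate


store = ToyStore()

# The attacker predicts a block that the victim will store later.
update = b"contents of the update you have not pulled yet"

# The attacker's email arrives first and is stored.
email_block = forge_collision(update, b"attacker email body ")
store.write(email_block)

# Later the victim pulls; the store sees a matching score and skips the write.
score = store.write(update)

# Reading the "update" back returns the attacker's block instead.
corrupted = store.read(score)
```

With a full-width hash the search in forge_collision is infeasible by
brute force, which is why the question is how strong an attack the
attacker really has.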

Now in actuality an attacker probably doesn't have an attack
this strong against your hash right now.  But he might have
much weaker attacks that he can use creatively to cause some
collisions that lead to corruption of data.  These attacks would
be much harder, but with enough creativity you can do some
interesting things.  For example, see:
http://www.win.tue.nl/hashclash/rogue-ca/

- erik

Tim Newsham | www.thenewsh.com/~newsham | thenewsh.blogspot.com
