Michael Stenner wrote:
> On Fri, May 17, 2002 at 04:15:34PM -0400, Theo Van Dinter wrote:
>>I would be extremely surprised if two people report different messages
>>that result in the same hash.  Although completely possible, it's also
>>very very unlikely.
> Someone said on this list that razor uses SHA1 (which I know to be
> true) and that SHA1 creates 20 byte hashes (which I'll assume to be
> true for the moment)
> 20 bytes = 160 bits
> Assuming it's a good (evenly distributed) hash and you're not
> intentionally trying to match something, you have the following data:
> odds of one new hash
> matching one already       number of hashes
> in the db                  in the db
> ====================       ================
> 2^-160 = 10^-49            1
> 2^-140 = 10^-43            10^6  = 2^20
> 2^-110 = 10^-34            10^15 = 2^50
> Now, with odds of about 10^-34, if you decide you're going to try
> enough hashes to give yourself a 1% CHANCE of finding one, you only
> need to try
> 0.01 * 10^34 = 10^32 times.  at 1,000,000,000 tries per second, that
> will only take you 10^23 seconds = roughly 200,000 times the age of
> the universe.
> By the way, 10^15 hashes is about 20,000 TeraBytes !
>                                       -Michael
> [ this is a rough order-of-magnitude calculation only.  If you're
> deliberately attacking the DB, you can do slightly better, but I just
> wanted to make it clear that neither accidents nor spammers are
> likely to pose a serious problem ]

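Unfortunately that's not strictly true.

You could very easily poison the database using short English phrases. I
see an awful lot of emails that contain just a word or a short phrase,
such as "Hello???" or "How did it go?". Identical text always produces
the identical hash, so once such a message is reported, legitimate mail
with the same body could be flagged too. Generating phrases like that
with a Markov chain system wouldn't be terribly hard.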
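
To make the difference concrete, here is a minimal Python sketch. It
hashes the raw message text with plain SHA-1 from hashlib as a stand-in;
I'm not assuming anything about how Razor actually normalizes or signs a
message, and the names sha1_hex, from_alice and from_bob are made up for
illustration. It compares the tiny odds of a random match with the
certainty of a match when two people send the same short text.

```python
import hashlib


def sha1_hex(body: str) -> str:
    """Plain SHA-1 of the message text (a stand-in, not Razor's real signature code)."""
    return hashlib.sha1(body.encode("utf-8")).hexdigest()


# Two unrelated senders who happen to write the same short message.
from_alice = "Hello???"
from_bob = "Hello???"
different = "How did it go?"

# Identical text always gives the identical 160-bit hash.
same_hash = sha1_hex(from_alice) == sha1_hex(from_bob)
# Different text gives a different hash (barring an astronomically unlikely collision).
other_hash = sha1_hex(from_alice) == sha1_hex(different)

# Chance that one new, effectively random 160-bit hash matches
# any one of 10^15 hashes already stored.
random_match = 10**15 / 2**160

print("same short text, same hash:", same_hash)
print("different text, same hash: ", other_hash)
print("random match probability:  ", random_match)
```

So the accidental-collision numbers hold up, but they don't cover the
case where the input itself is short and common enough to be repeated.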



Spamassassin-talk mailing list
