> The Perl code will produce the same hash for "abc.html" as for "bca.html".
> That's probably one reason Leonard didn't try to transliterate the buggy code.
> 

Actually, to give credit where it's due, it wasn't me.  I just modified someone 
else's interesting solution in this thread and added the silly limit of 10000 
to it.

> In any case, the likelihood of a hash collision for any non-trivial website 
> is substantial.
> 

Exactly.  Four digits is hardly enough range for it to be even remotely safe.  
And even then, range isn't really the issue: technically, a larger range only 
improves your odds.  It never guarantees uniqueness.
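
To put a rough number on those odds, here is a small sketch.  It assumes the 
hash values are spread uniformly over the 10000 possible four-digit values, 
which is an idealisation (a real hash function may well do worse), and the 
function name is made up for illustration.

```python
def collision_probability(n_items, n_buckets=10000):
    """Chance that at least two of n_items share a bucket, assuming each
    item lands in a uniformly random bucket (an idealised assumption)."""
    p_unique = 1.0
    for i in range(n_items):
        # The (i+1)-th item must avoid the i buckets already taken.
        p_unique *= float(n_buckets - i) / n_buckets
    return 1.0 - p_unique

for n in (50, 119, 500):
    print(n, round(collision_probability(n), 3))
```

Under that assumption the chance of at least one collision passes 50% at 
roughly 120 pages, which is not a large website.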

The result of a modulus operation is still non-unique no matter how many 
digits there are to work with (within reason), because many different inputs 
share the same remainder.  Statistically, anyone who buys a ticket could 
potentially win the lottery no matter how bad the odds are.  ;)
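
As an illustration, here is a hypothetical stand-in (not the actual Perl code 
from this thread): a hash that simply sums character codes and reduces the 
total with a modulus.  It ignores character order, so the two file names 
quoted above collide, and widening the modulus does not change that.

```python
def toy_hash(path, modulus=10000):
    # Hypothetical order-insensitive hash: sum of character codes,
    # reduced to the range 0..modulus-1.
    return sum(ord(ch) for ch in path) % modulus

# Same characters in a different order give the same sum.
print(toy_hash("abc.html") == toy_hash("bca.html"))
# A much larger modulus still cannot tell them apart.
print(toy_hash("abc.html", 10**9) == toy_hash("bca.html", 10**9))
```

Both comparisons print True: more digits only lower the odds of an accidental 
collision; they do nothing for inputs the function cannot distinguish.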

And now back to the OP: I'm still confused by this four-digit limitation.  Why 
doesn't the limit at least correspond to a byte length, such as a byte, short, 
or long?  Is this database storing a string of characters instead of an actual 
number?  (And if so, then why not just block out 255 characters instead of 4 to 
store a whole path?  Or at the very least treat 4 characters as 4 bytes to 
greatly increase the numeric range?)
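
For a sense of scale, here is a quick comparison of the two ways to use four 
characters.  The variable names are only for illustration, and `struct` is 
used just to show that the largest such value really fits in 4 bytes.

```python
import struct

decimal_range = 10 ** 4     # four decimal digits: 0000 through 9999
byte_range = 256 ** 4       # four raw bytes: 0 through 4294967295

# Pack the largest unsigned 32-bit value into exactly four bytes.
packed = struct.pack(">I", byte_range - 1)
print(decimal_range, byte_range, len(packed))
```

That is 10000 possible values versus about 4.3 billion in the same four 
characters of storage.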
-- 
http://mail.python.org/mailman/listinfo/python-list
