On 01/22/2013 03:30 PM, Leonard, Arah wrote:
The Perl code will produce the same hash for "abc.html" as for "bca.html".
That's probably one reason Leonard didn't try to transliterate the buggy code.
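
To illustrate how that kind of collision can come about, here is a minimal sketch. It assumes a hash that merely sums character codes, which is a guess made for illustration and not the Perl code itself; sum_hash and its limit parameter are hypothetical names. Any such scheme is blind to character order, so anagrams always collide.

```python
def sum_hash(name, limit=10000):
    """Hypothetical order-insensitive hash: sum of code points, modulo limit."""
    return sum(ord(ch) for ch in name) % limit

# The two names contain the same characters in a different order,
# so their sums (and therefore their hashes) are identical.
print(sum_hash("abc.html") == sum_hash("bca.html"))  # True
```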


Actually, to give credit where it's due, it wasn't me.  I just modified someone 
else's interesting solution in this thread and added the silly limit of 10000 
to it.


That's okay. The OP doesn't seem to know anything about programming, or about information theory, so the fact you gave a single line that actually "works" must be extraordinarily valuable to him. When he was trying to use the md5 module, I gave him hints about his five programming errors, and was about to expand on them when I noticed his four-digit limitation.

In any case, the likelihood of a hash collision for any non-trivial website is 
substantial.
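
To put a rough number on "substantial", the sketch below works out the classic birthday-problem estimate. It assumes an ideal hash that spreads pages uniformly over 10000 slots, which is the best case; collision_probability is a name made up for this example.

```python
def collision_probability(n_items, n_slots=10000):
    """Chance that at least two of n_items land in the same slot,
    assuming every slot is equally likely (an idealized hash)."""
    p_unique = 1.0
    for i in range(n_items):
        # The (i+1)-th item must avoid the i slots already taken.
        p_unique *= (n_slots - i) / n_slots
    return 1.0 - p_unique

for n in (10, 100, 500):
    print(n, round(collision_probability(n), 3))
```

Under that assumption a site with only 100 pages already has well over a one-in-three chance of at least one collision, and a real hash can only do worse.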


Exactly.  Four digits is hardly enough range for it to be even remotely safe.  
And even then range isn't really the issue as technically it just improves your 
odds.

The results of a modulus operator are still non-unique no matter how many 
digits are there to work with ... within reason.  Statistically anyone who buys 
a ticket could potentially win the lottery no matter how bad the odds are.  ;)
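
A small experiment makes the same point. This is only a sketch: it assumes the scheme is "md5 of the path, reduced modulo 10000", and the page names are invented. With 10001 distinct paths and only 10000 possible results, a collision is guaranteed by the pigeonhole principle, and in practice one shows up far sooner.

```python
import hashlib

def short_hash(path, limit=10000):
    """md5 of the path, reduced to the range 0..limit-1."""
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return int(digest, 16) % limit

def first_collision(paths):
    """Return (earlier_path, later_path, shared_hash) for the first
    pair of paths that hash alike, or None if there is no such pair."""
    seen = {}
    for path in paths:
        h = short_hash(path)
        if h in seen:
            return seen[h], path, h
        seen[h] = path
    return None

# Hypothetical page names, one more than there are hash values.
paths = ("page%d.html" % i for i in range(10001))
print(first_collision(paths))
```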

And now back to the OP, I'm still confused on this four-digit limitation.  Why 
isn't the limitation at least adhering to a byte length like byte/short/long?  
Is this database storing a string of characters instead of an actual number?  
(And if so, then why not just block out 255 characters instead of 4 to store a 
whole path?  Or at the very least treat 4 characters as 4 bytes to greatly 
increase the numeric range?)
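
For a sense of scale, here is the arithmetic behind that last suggestion as a short sketch. It assumes the column could hold four arbitrary byte values, which many character columns will not accept without some encoding.

```python
import struct

decimal_range = 10 ** 4   # four decimal digits: 0000..9999
byte_range = 256 ** 4     # four raw bytes read as an unsigned 32-bit integer

# The largest 32-bit value still packs into exactly four bytes.
packed = struct.pack(">I", byte_range - 1)

print(decimal_range, byte_range, len(packed))  # 10000 4294967296 4
```

That is roughly 430,000 times as many values in the same four positions, though collisions remain possible either way.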


I wish I had done the internet search earlier. This name 'ferrous cranus' is a pseudonym of various trolls, and anybody who'd adopt it isn't worth our time.

Thanks to Alan Spence for spotting that.  I'll plonk 'ferrous cranus' now.


--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list
