Paul Rubin wrote:
> James Stroud <[EMAIL PROTECTED]> writes:
>
>>Then your best bet is to take a reasonable number of bits from an sha hash.
>>But you do not need pycrypto for this. The previous answer by "ncf" is good,
>>but use the standard library and take 9 digits to lessen probability for
>>clashes
>>
>>import sha
>>def encrypt(x,y):
>>    def _dosha(v): return sha.new(str(v)).hexdigest()
>>    return int(_dosha(_dosha(x)+_dosha(y))[5:13],16)
>>...
>>Each student ID should be unique until you get a really big class. If your
>>class might grow to several million, consider taking more bits of the hash.
>
>
> Please don't give advice like this unless you know what you're doing.
> You're taking 8 hex digits and turning them into an integer. That
> means you'll probably have a collision after around 65,000 id's, not
> several million. "Probably" means > 50%. You'll have a significant
> chance (say more than 1%) of collision after maybe 10,000.
>
> Also, if you know the student's graduation year, in most cases there
> are just a few hundred likely birthdates for that student, so by brute
> force search you can crunch the output of your function to a fairly
> small number of DOB/SSN combinations.
>
> The only approach that makes sense is for the secure database to
> assign arbitrary numbers that aren't algorithmically related to any
> sensitive data. Answers involving encryption will need to use either
> large ID numbers or secret keys, both of which will cause hassles.
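The collision figures quoted above can be checked with the usual birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)), where N = 2**32 for 8 hex digits. The sketch below assumes the truncated hash behaves like a uniformly random 32-bit value; the function names are made up for illustration and are not from the thread. Under that assumption the 1% point falls a little under 10,000 IDs and the 50% point nearer 77,000, the same order of magnitude as the numbers quoted and nowhere near several million.

```python
import math

def collision_probability(n_ids, bits):
    """Approximate chance that at least two of n_ids uniformly random
    values collide in a space of 2**bits values (birthday approximation)."""
    space = 2 ** bits
    return 1.0 - math.exp(-n_ids * (n_ids - 1) / (2.0 * space))

def ids_for_probability(p, bits):
    """Approximate number of IDs at which the collision probability reaches p."""
    space = 2 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))

# 8 hex digits carry 32 bits.
for n in (10_000, 65_000, 1_000_000):
    print(n, collision_probability(n, 32))

print("IDs for 1% chance: ", ids_for_probability(0.01, 32))
print("IDs for 50% chance:", ids_for_probability(0.50, 32))
```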
This is indubitably true. There's absolutely no excuse for making the primary
key a function of the data the record contains, as doing so will assist any
cryptanalytic attack.

regards
 Steve
--
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC     http://www.holdenweb.com/
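For illustration only, here is a minimal sketch of assigning IDs that are not a function of the record's data. It assumes the issued IDs are tracked in an in-memory set; a real system would more likely rely on a uniqueness constraint in the secure database. The names `assign_id`, `issued` and `student_ids` are hypothetical.

```python
import secrets

def assign_id(existing_ids, digits=9):
    """Return a random numeric ID that is not already in existing_ids.

    The value comes from the OS random source and is not derived from
    any field of the record, so it reveals nothing about the record.
    """
    while True:
        candidate = secrets.randbelow(10 ** digits)
        if candidate not in existing_ids:
            existing_ids.add(candidate)  # remember it so it is never reissued
            return candidate

# Hypothetical usage: the mapping from student to ID lives only in the
# secure database and cannot be recomputed from DOB or SSN.
issued = set()
student_ids = {name: assign_id(issued) for name in ("alice", "bob", "carol")}
```

Because collisions are rejected at assignment time, uniqueness no longer depends on how many bits the ID happens to have.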