On Thu, 26 Jul 2012 14:26:16 +0200, Laszlo Nagy wrote:

> I do not want this program to generate very long identifiers. It would
> increase SQL parsing time,

Will that increase in SQL parsing time be more, or less, than the time it
takes to generate CRC32 or SHA hashsums and append them to a truncated
identifier?

> * Would it be a problem to use CRC32 instead of SHA? (Since security is
> not a problem, and CRC32 is faster.)

What happens if you get a collision?

That is, you have two different long identifiers:

a.b.c.d...something
a.b.c.d...anotherthing

which by bad luck both hash to the same value:

a.b.c.d.$AABB99
a.b.c.d.$AABB99

(or whatever).

> * I'm truncating the digest value to 10 characters. Is it safe enough?
> I don't want to use more than 10 characters, because then it wouldn't be
> possible to recognize the original name.
> * Can somebody think of a better algorithm, that would give a bigger
> chance of recognizing the original identifier from the modified one?

Rather than truncating the most significant part of the identifier, the
field name, you should truncate the least important part, the middle.

a.b.c.d.e.f.g.something

goes to:

a.b...g.something

or similar.

-- 
Steven
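
The middle-truncation idea above can be sketched roughly as follows. This
is only an illustration, assuming identifiers are dot-separated; the names
shorten_middle and shorten_all, the head/tail counts, and the "..." marker
are hypothetical choices, not part of the original program. Since dropping
the middle can also map two different names to the same short form, the
sketch checks for that instead of silently accepting it.

```python
def shorten_middle(identifier, head=2, tail=2, sep=".", marker="..."):
    """Drop the middle components of a dotted identifier, keeping both ends.

    head/tail are hypothetical tuning knobs: how many leading and
    trailing components to keep.
    """
    parts = identifier.split(sep)
    if len(parts) <= head + tail:
        return identifier  # short enough already, nothing to drop
    kept_head = sep.join(parts[:head])
    kept_tail = sep.join(parts[len(parts) - tail:])
    return kept_head + marker + kept_tail


def shorten_all(identifiers, **options):
    """Shorten every identifier; raise if two distinct names collide."""
    seen = {}  # shortened name -> original name
    for original in identifiers:
        short = shorten_middle(original, **options)
        if short in seen and seen[short] != original:
            raise ValueError(
                "collision: %r and %r both shorten to %r"
                % (seen[short], original, short)
            )
        seen[short] = original
    # Return the mapping in the more useful direction: original -> short.
    return {original: short for short, original in seen.items()}
```

With the defaults, shorten_middle("a.b.c.d.e.f.g.something") returns
"a.b...g.something", matching the example above. How a collision should be
resolved (keeping more components, appending a counter, and so on) is left
open here.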