I found the MD5 and SHA hashes slow to calculate. The builtin hash is fast but I was concerned about collisions. What rate of collisions could I expect?
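For a rough sense of scale, the usual birthday approximation can be sketched as below. This is only an estimate, and it assumes hash values are uniformly distributed over the full bit width, which the built-in hash does not guarantee. The function names and the figure of one million URLs are illustrative assumptions, not anything from the thread.

```python
import math

def expected_colliding_pairs(n_items, hash_bits):
    """Approximate expected number of colliding pairs among n_items
    hash values, ASSUMING they are uniform over 2**hash_bits values."""
    return n_items * (n_items - 1) / 2 / 2 ** hash_bits

def collision_probability(n_items, hash_bits):
    """Birthday approximation of the chance of at least one collision:
    p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-expected_colliding_pairs(n_items, hash_bits))

if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of URLs
    for bits in (32, 64, 128):
        print(bits, expected_colliding_pairs(n, bits), collision_probability(n, bits))
```

Under that uniformity assumption, a 32-bit value is expected to collide well before a million items, while 64 bits or more makes an accidental collision unlikely at that scale. Separately, from Python 3.3 on, string hashes are randomized per process by default, so the built-in hash would not give the same ID in different processes unless PYTHONHASHSEED is fixed.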
Outside attacks are not an issue, and multiple processes would be used.

On Wed, Nov 14, 2012 at 11:26 AM, Chris Kaynor <ckay...@zindagigames.com> wrote:
> One option would be using a hash. Python's built-in hash, a 32-bit
> CRC, 128-bit MD5, 256-bit SHA or one of the many others that exist,
> depending on the needs. Higher bit counts will reduce the odds of
> accidental collisions; cryptographically secure ones if outside
> attacks matter. In such a case, you'd have to roll your own means of
> converting the hash back into the string if you ever need it for
> debugging, and there is always the possibility of collisions. A
> similar solution would be using a pseudo-random GUID using the url as
> the seed.
>
> You could use a counter if all IDs are generated by a single process
> (and even in other cases with some work).
>
> If you want to be able to go both ways, using base64 encoding is
> probably your best bet, though you might get benefits by using
> compression.
> Chris
>
>
> On Tue, Nov 13, 2012 at 3:56 PM, Richard <richar...@gmail.com> wrote:
>> Good point - one way encoding would be fine.
>>
>> Also this is performed millions of times so ideally efficient.
>>
>>
>> On Wednesday, November 14, 2012 10:34:03 AM UTC+11, John Gordon wrote:
>>> In <0692e6a2-343c-4eb0-be57-fe5c815ef...@googlegroups.com> Richard
>>> <richar...@gmail.com> writes:
>>>
>>> > I want to create a URL-safe unique ID for URL's.
>>> > Currently I use:
>>> > url_id = base64.urlsafe_b64encode(url)
>>> >
>>> > >>> base64.urlsafe_b64encode('docs.python.org/library/uuid.html')
>>> > 'ZG9jcy5weXRob24ub3JnL2xpYnJhcnkvdXVpZC5odG1s'
>>> >
>>> > I would prefer more concise ID's.
>>> > What do you recommend? - Compression?
>>>
>>> Does the ID need to contain all the information necessary to recreate the
>>> original URL?
>>>
>>> --
>>> John Gordon                   A is for Amy, who fell down the stairs
>>> gor...@panix.com              B is for Basil, assaulted by bears
>>>                                 -- Edward Gorey, "The Gashlycrumb Tinies"
>>
>> --
>> http://mail.python.org/mailman/listinfo/python-list