Steven D'Aprano <[EMAIL PROTECTED]> writes:
> > Sorry, make that 32 or 40 instead of 10, if the number of id's is large,
> > to make birthday collisions unlikely.
>
> I did a small empirical test, and with 16 million ids, I found no
> collisions.
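For a rough feel of the birthday numbers, here is a minimal sketch. It assumes the ids are uniformly random bit strings; the function name and the bit sizes in the loop (40, 80, 128) are illustrative choices of mine, not figures taken from the test above.

```python
import math

def collision_probability(num_ids: int, bits: int) -> float:
    """Approximate chance of at least one collision among num_ids ids.

    Assumes each id is an independent, uniformly random value of `bits` bits
    and uses the usual birthday approximation 1 - exp(-pairs / 2**bits).
    """
    pairs = num_ids * (num_ids - 1) / 2
    # expm1 keeps precision when the probability is tiny
    return -math.expm1(-pairs / 2.0 ** bits)

if __name__ == "__main__":
    n = 16_000_000
    for bits in (40, 80, 128):  # illustrative sizes only
        print(bits, collision_probability(n, bits))
```

Under that uniformity assumption, 16 million ids at 40 bits are all but certain to collide, while at 128 bits (16 bytes) the chance is negligible.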
16 million 32-byte ids?  With string and dictionary overhead that's
probably on the order of 1 GB.  Anyway, 16 bytes is enough, as
mentioned elsewhere.

> However, I did find that trying to dispose of a set of 16 million short
> strings caused my Python session to lock up for twenty minutes until I
> got fed up and killed the process. Should garbage-collecting 16 million
> strings really take 20+ minutes?

Maybe your system was thrashing, or maybe the GC was happening during
allocation (there was some discussion of that a while back).

> > If you don't want the id's to be that large, you can implement a Feistel
> I'm not sure that I need it, but I would certainly be curious to see it.

I posted some code elsewhere in the thread.
-- 
http://mail.python.org/mailman/listinfo/python-list
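As a rough illustration of the Feistel idea (this is not the code posted elsewhere in the thread), here is a minimal sketch of a balanced Feistel network that maps a small counter to a scrambled id of the same width. Because the mapping is a permutation, distinct counters can never collide. The function names, the SHA-256 round function, the 32-bit width and the round count are all assumptions made for the example, and it should not be taken as a vetted cipher.

```python
import hashlib

def _round(half: int, key: bytes, rnd: int, half_bits: int) -> int:
    """Hypothetical round function: keyed SHA-256, truncated to half_bits."""
    data = key + bytes([rnd]) + half.to_bytes(8, "big")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << half_bits) - 1)

def _check(n: int, bits: int) -> None:
    if bits % 2 or not 2 <= bits <= 128:
        raise ValueError("bits must be even and between 2 and 128")
    if not 0 <= n < (1 << bits):
        raise ValueError("n out of range for the given width")

def feistel_encrypt(n: int, key: bytes, bits: int = 32, rounds: int = 4) -> int:
    """Permute an integer in [0, 2**bits) with a balanced Feistel network."""
    _check(n, bits)
    half_bits = bits // 2
    left, right = n >> half_bits, n & ((1 << half_bits) - 1)
    for r in range(rounds):
        # Standard Feistel step: (L, R) -> (R, L xor F(R))
        left, right = right, left ^ _round(right, key, r, half_bits)
    return (left << half_bits) | right

def feistel_decrypt(n: int, key: bytes, bits: int = 32, rounds: int = 4) -> int:
    """Invert feistel_encrypt by running the rounds backwards."""
    _check(n, bits)
    half_bits = bits // 2
    left, right = n >> half_bits, n & ((1 << half_bits) - 1)
    for r in reversed(range(rounds)):
        # Inverse step: (L', R') -> (R' xor F(L'), L')
        left, right = right ^ _round(left, key, r, half_bits), left
    return (left << half_bits) | right

if __name__ == "__main__":
    key = b"example key"  # placeholder key for the demo
    for counter in range(5):
        print(counter, feistel_encrypt(counter, key))
```

Feeding it a sequential counter gives ids that look unrelated to one another but are guaranteed unique within the chosen width, which is what lets the ids stay short.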