Michael Ludwig wrote:
> André Warnier wrote on 15.05.2010 at 22:47:14 (+0200):
>> A tip: to get a really unique identifier, I often use
>> "yyyymmddhhmmssrrrrr", where rrrrr is the rand() result
>> and the prefix is the date/time. Unless you process more
>> than 99,999 requests per second, it will always be unique.
>
> I'm probably missing something trivial, but how do you
> enforce uniqueness over rrrrr for 99,999 consecutive calls
> to rand?
>
> The perldoc doesn't promise any uniqueness, only randomness,
> which isn't uniqueness.
You are right; I should not have said "always". Uniqueness is still not
guaranteed, and nothing "enforces" it.
But the yyyymmddhhmmss prefix means that a duplicate would have to occur
within the same second: two identifiers can only collide if they share
both the timestamp and the rand() value. That is quite a bit less likely
than a collision over a much longer period of time.
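
Concretely, what I do amounts to something like this (a minimal sketch;
the exact time formatting and zero-padding are illustrative, not lifted
from my production code):

#!/usr/bin/perl
# Minimal sketch of the scheme: a timestamp down to the second,
# followed by a zero-padded rand() value in 0..99999.
use strict;
use warnings;
use POSIX qw(strftime);

sub make_id {
    my $stamp = strftime('%Y%m%d%H%M%S', localtime);
    my $rrrrr = sprintf('%05d', int(rand(100_000)));
    return $stamp . $rrrrr;
}

print make_id(), "\n";   # e.g. 2010051522471483652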
I have never done the math to calculate exactly how likely it is that I
would get two identical identifiers for any given number of requests per
second, nor what adding one rand() digit would do to that probability.
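
For a rough idea, though, the usual birthday-problem approximation can be
sketched in a few lines. The request rates below are made up for
illustration; within one second there are d = 100,000 equally likely
rrrrr values, and the chance of at least one duplicate among n
identifiers is roughly 1 - exp(-n(n-1)/2d):

#!/usr/bin/perl
# Back-of-the-envelope birthday-problem estimate: probability of
# at least one duplicate among $n identifiers generated within one
# second, drawn from $d = 100,000 equally likely rand() values.
use strict;
use warnings;

my $d = 100_000;
for my $n (10, 100, 1_000) {
    my $p = 1 - exp( -$n * ($n - 1) / (2 * $d) );
    printf "%5d ids in one second -> P(duplicate) ~ %.4f\n", $n, $p;
}

Adding one rand() digit multiplies $d by ten, which divides the exponent
by ten and so, for small probabilities, roughly divides the collision
chance by ten as well.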
Let's say this: I have a dozen websites where the above is being used.
None of them even approaches 99,999 requests per second, and none of them
necessarily generates a unique id at each request.
But the setup is such that, should a duplicate identifier be generated,
the application stops with an error. I set it up that way because I was
curious, and because my sites are not Google.
Over a period of about 5 years, it has never happened.
Probabilities being what they are, that does not mean it cannot be
happening at this very second. But then again, the Sun may have turned
into a supernova 6 minutes ago.
One alternative is to use a strictly incremental counter, shared between
multiple processes running on potentially multiple systems or CPUs. This
requires a common place to store the counter, one which survives a system
restart, and it requires some lock-read-increment-unlock mechanism. I
don't know any really fast and efficient way of doing this, but I would
be interested if anyone knows one.
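
For what it's worth, the simplest version I can think of is a counter
file plus flock(). A sketch (the path is hypothetical; this handles
multiple processes on one host and survives restarts, but multiple
systems would need the file on shared storage with reliable locking, or
a database sequence instead):

#!/usr/bin/perl
# Lock-read-increment-unlock sketch: a counter file serialized
# with flock().  The counter survives restarts because it lives
# on disk; the per-call open/lock/write round trip is also why
# this is not particularly fast.
use strict;
use warnings;
use Fcntl qw(:flock :seek O_RDWR O_CREAT);

sub next_counter {
    my $path = '/var/tmp/app-counter';          # hypothetical location
    sysopen my $fh, $path, O_RDWR | O_CREAT
        or die "cannot open $path: $!";
    flock $fh, LOCK_EX or die "cannot lock $path: $!";
    my $n = <$fh>;                              # current value, if any
    $n = (defined $n ? $n : 0) + 1;
    seek $fh, 0, SEEK_SET;
    truncate $fh, 0 or die "cannot truncate $path: $!";
    print {$fh} "$n\n";
    close $fh or die "cannot close $path: $!";  # close releases the lock
    return $n;
}

print next_counter(), "\n";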