
>
> Another thing I've noticed is that you are using
> (System/currentTimeMillis) to get the wall clock on every generation.
>
> (System/currentTimeMillis) causes a low level system call which in turn 
> causes a context switch.
>
> Maybe one way to improve could be use a initial (System/currentTimeMillis) 
> on the first init! and then
> use System/nanoTime to calculate the time elapsed from the init.
> The advantage would be that System/nanoTime runs in the UserSpace (not 
> Kernel Space) and it doesn't require
> a system call (so no context switch).
>
> This could really help the case of a bulk production of IDs and any other 
> burst situation.
>
>
> I really like this idea. I’m certainly open to pull requests if you wanted 
> to take a stab at it otherwise I may try my hand at making this 
> improvement. :)
>

Hi, this change is actually easier than it sounds. Looking at the code, I 
came across a couple of other things which I think could be improved.
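
On the clock change first, here is a minimal sketch of what I mean: read the 
wall clock once at init and derive later timestamps from System/nanoTime. The 
names make-clock and now-millis are hypothetical, not taken from the library.

```clojure
;; Hypothetical helpers, for illustration only.
(defn make-clock
  "Reads the wall clock once and records the matching nanoTime reading."
  []
  {:epoch-ms  (System/currentTimeMillis)
   :origin-ns (System/nanoTime)})

(defn now-millis
  "Estimated wall-clock time in ms: the epoch captured at init plus the
  time elapsed since then, as measured by System/nanoTime."
  [{:keys [epoch-ms origin-ns]}]
  (+ epoch-ms (quot (- (System/nanoTime) origin-ns) 1000000)))

(def clock (make-clock))
(now-millis clock)
```

The trade-off is that such a clock no longer follows adjustments made to the 
system clock after init (NTP corrections, for example), since nanoTime only 
measures elapsed time.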

1) use of filesystem persistence.

I'm not sure that file-based persistence is a good idea. Maybe this is 
idiomatic design in Erlang, but it definitely doesn't look nice in 
Clojure.

In particular, I'm not sure that storing the init time epoch actually 
accomplishes anything at all.
I would argue that there are a number of problems there: race conditions on 
the data, the tmp file being purged, and it still doesn't protect against 
the clock drifting during use.

2) use of CAS (atom) for storing the VM state.
If it is truly decentralized then you shouldn't need an atom at all. The 
disadvantage of CAS is that, when many threads race to make the same change, 
only one succeeds and all the others fail and retry. This means that if you 
have 100 threads (for example), in the worst case only 1 succeeds and the 
other 99 fail and retry. In the second round again only 1 succeeds and 98 
retry, and so on.
Therefore the total number of attempts will be 

<https://lh3.googleusercontent.com/-ZVELcKNoB9M/V2kxgYmlFMI/AAAAAAAAB8Q/nR6jLFjKSI0611-WiQpQHXAcY3SueVIdwCLcB/s1600/Screen%2BShot%2B2016-06-21%2Bat%2B13.21.24.png>
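
To put a number on it: assuming the worst case described above (exactly one 
success per round), the attempts add up to n + (n-1) + ... + 1 = n(n+1)/2. 
The sketch below computes that bound and, separately, counts how often the 
update function of a real swap! runs. Both function names are hypothetical.

```clojure
(import 'java.util.concurrent.atomic.AtomicLong)

(defn worst-case-attempts
  "Total CAS attempts if, in every round, one thread succeeds and all the
  remaining ones retry: n + (n-1) + ... + 1."
  [n]
  (reduce + (range 1 (inc n))))

(defn measured-attempts
  "Starts n-threads futures that each swap! the same atom once, and counts
  how many times the update function ran (successes plus retries)."
  [n-threads]
  (let [state    (atom 0)
        attempts (AtomicLong.)
        workers  (doall
                  (repeatedly n-threads
                              #(future
                                 (swap! state (fn [v]
                                                (.incrementAndGet attempts)
                                                (inc v))))))]
    (run! deref workers)
    {:final @state :attempts (.get attempts)}))

(worst-case-attempts 100) ;; => 5050
(measured-attempts 100)   ;; :final is 100, :attempts varies from run to run
```

With an update function this small the measured count usually stays well 
below the worst case; contention grows with the work done inside swap! and 
with the number of cores. When run as a script, call (shutdown-agents) at the 
end so the future thread pool does not keep the JVM alive.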

If you want to develop a truly "*decentralized*" id generator, I think you 
need to drop the atom in favour of thread-local storage.
To do so and still make collisions impossible, we need to add more bits:


   - 64 bits - ts (i.e. a timestamp)
   - 48 bits - worker-id/node (i.e. MAC address)
   - 32 bits - worker-id/process (pid)
   - 64 bits - worker-id/thread (thread num)
   - 32 bits - seq-no (i.e. a counter)
   
By adding the process id (pid) and the thread id there is no possibility of 
two systems running and creating the same id at the same time.
Finally, by using thread-local storage there is no need for process-level 
coordination (atom) and no risk of retries caused by threads stepping on 
each other's toes.
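
The five fields listed above add up to 240 bits, i.e. 30 bytes. As a rough 
sketch of how they could be packed (pack-id, hex and the map keys are 
hypothetical names, and big-endian field order is my assumption):

```clojure
(import 'java.nio.ByteBuffer)

(defn pack-id
  "Packs the five fields into a 30-byte (240-bit) big-endian byte array."
  [{:keys [ts node pid thread seq-no]}]
  (let [buf (ByteBuffer/allocate 30)]
    (.putLong buf (long ts))                                  ; 64 bits
    ;; 48-bit node id: top 16 bits first, then the low 32 bits
    (.putShort buf (unchecked-short (bit-shift-right (long node) 32)))
    (.putInt buf (unchecked-int (long node)))
    (.putInt buf (unchecked-int (long pid)))                  ; 32 bits
    (.putLong buf (long thread))                              ; 64 bits
    (.putInt buf (unchecked-int (long seq-no)))               ; 32 bits
    (.array buf)))

(defn hex
  "Renders a byte array as a lowercase hex string."
  [bs]
  (apply str (map #(format "%02x" (bit-and % 0xff)) bs)))

(hex (pack-id {:ts 1 :node 0xAABBCCDDEEFF :pid 2 :thread 3 :seq-no 4}))
```

Putting the timestamp first keeps ids roughly sortable by creation time when 
compared byte by byte.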

With such a setup, 100 threads will be able to increment their own 
thread-local counters independently (given that you have 100 execution cores).
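
A minimal sketch of the thread-local counter, assuming a plain 
java.lang.ThreadLocal holding a one-element long array per thread. All names 
(local-seq, next-seq!, next-local-id!, demo) are hypothetical, and only the 
thread id and seq-no fields are shown:

```clojure
(def ^ThreadLocal local-seq
  "One mutable counter cell per thread."
  (proxy [ThreadLocal] []
    (initialValue [] (long-array 1))))

(defn next-seq!
  "Increments and returns the calling thread's own counter.
  No shared state is touched, so there is no CAS and no retry."
  []
  (let [^longs cell (.get local-seq)
        n           (inc (aget cell 0))]
    (aset cell 0 n)
    n))

(defn next-local-id!
  "The thread-specific part of an id: thread id plus that thread's counter."
  []
  {:thread (.getId (Thread/currentThread))
   :seq-no (next-seq!)})

(defn demo
  "Generates ids-per-thread ids on each of n-threads futures."
  [n-threads ids-per-thread]
  (let [workers (doall
                 (repeatedly n-threads
                             #(future
                                (doall (repeatedly ids-per-thread
                                                   next-local-id!)))))]
    (mapcat deref workers)))

(let [ids (demo 8 1000)]
  (= (count ids) (count (distinct ids)))) ;; no duplicates expected
```

One caveat: the JDK documentation allows a thread id to be reused after its 
thread terminates, so a real generator would still rely on the timestamp (or 
something similar) to tell such threads apart.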

What do you think?
Bruno

 


-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en