Bruno, I think the more you can reduce the chance of collision the better, and the thread-local capability is a good idea, but in the process you've almost doubled the bits.
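
To put a number on that, here is a quick sketch that adds up the field widths from your proposal quoted below and compares the total with a 128-bit target. It is written in Java only because the arithmetic is plain JVM ints; the class and constant names are made up for illustration.

```java
// Hypothetical helper: adds up the field widths of the proposed ID layout.
final class IdBitBudget {
    static final int TS_BITS = 64;      // ts (timestamp)
    static final int NODE_BITS = 48;    // worker-id/node (MAC address)
    static final int PID_BITS = 32;     // worker-id/process (pid)
    static final int THREAD_BITS = 64;  // worker-id/thread (thread num)
    static final int SEQ_BITS = 32;     // seq-no (counter)

    // Total width of one ID in bits.
    static int totalBits() {
        return TS_BITS + NODE_BITS + PID_BITS + THREAD_BITS + SEQ_BITS;
    }

    // How many times larger the layout is than a 128-bit ID.
    static double ratioTo128() {
        return totalBits() / 128.0;
    }

    public static void main(String[] args) {
        // prints "240 bits, 1.875x the 128-bit target"
        System.out.println(totalBits() + " bits, " + ratioTo128() + "x the 128-bit target");
    }
}
```

So the layout comes to 240 bits, a bit under twice the 128 I have in mind.
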
For me anyhow, an ID needs to be producible at a reasonable rate (1 million a second per machine is good for me), have a near-zero probability of collision, and take up the least amount of space possible. Under those criteria, I think 128 bits is a reasonable target, and I would expect the thread-safe atom to handle such volume (although I haven't tested it). If you need a billion per second and don't want 100 machines producing them, then I think you are at the point of needing thread independence and probably have to increase the bit count, and your ideas provide a good path towards such a solution.

Your comment on the file persistence is a good one; I wonder if the potential problems are real enough to warrant the risks.

My other curiosity is whether System/nanoTime is guaranteed to increment across threads. I know that, at least a while ago, this guarantee did not exist.

-Brian

On Tuesday, June 21, 2016 at 8:38:58 AM UTC-4, Bruno Bonacci wrote:
>
> Hi, this change is actually easier than it sounds. Looking at the code,
> I came across a couple of things which I think could be better.
>
> 1) use of filesystem persistence.
>
> I'm not too sure that the file-based persistence is a good idea. Maybe
> this is a good idiomatic design for Erlang, but it definitely doesn't
> look nice in Clojure.
>
> In particular, I'm not too sure that by storing the init time epoch we
> actually accomplish anything at all.
> I would argue that there are a number of problems there: race conditions
> on data, the tmp file being purged, and it still doesn't protect against
> the case where the clock drifts during use.
>
> 2) use of CAS (atom) for storing the VM state.
>
> If it is truly decentralized then you shouldn't need an atom at all. The
> disadvantage of the CAS is that, when many threads race to make the same
> change, only one will succeed and all the others will fail and retry.
> This means that if you have 100 threads (for example), only 1 will
> succeed and the other 99 will fail and retry. Again, at the second round
> only 1 will succeed and 98 will retry, and so on.
> Therefore the total number of attempts will be
>
> <https://lh3.googleusercontent.com/-ZVELcKNoB9M/V2kxgYmlFMI/AAAAAAAAB8Q/nR6jLFjKSI0611-WiQpQHXAcY3SueVIdwCLcB/s1600/Screen%2BShot%2B2016-06-21%2Bat%2B13.21.24.png>
>
> If you want to develop a real "*decentralized*" id generator, I think
> you need to drop the atom in favour of a thread-local store.
> To do so and make collisions impossible, we need to add more bits:
>
>    - 64 bits - ts (i.e. a timestamp)
>    - 48 bits - worker-id/node (i.e. MAC address)
>    - 32 bits - worker-id/process (pid)
>    - 64 bits - worker-id/thread (thread num)
>    - 32 bits - seq-no (i.e. a counter)
>
> By adding the process id (pid) and the thread id, there is no possibility
> of two systems running and creating the same id at the same time.
> Finally, by using thread-local storage there is no need for process-level
> coordination (atom) and no risk of retries caused by threads stepping on
> each other's toes.
>
> With such a setup, 100 threads will be able to increment their own
> thread-local counters independently (given that you have 100 execution
> cores).
>
> What do you think?
> Bruno
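
For what it's worth, here is a minimal sketch of how the thread-local layout quoted above could look on the JVM. It is in Java rather than Clojure only to keep the underlying pieces explicit, and it is not taken from the library under discussion. The class name `ThreadLocalIdGenerator` is hypothetical, the node id and pid are assumed to be supplied by the caller (how to obtain a MAC address is left out), and the sketch does nothing about the clock moving backwards.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a 240-bit ID: ts | node | pid | thread | seq.
final class ThreadLocalIdGenerator {
    static final int ID_BYTES = 30; // 64 + 48 + 32 + 64 + 32 bits = 240 bits

    private final long nodeId; // only the low 48 bits are written
    private final int pid;

    // One counter per thread: no shared atom, so no CAS retries under contention.
    private final ThreadLocal<int[]> seq = ThreadLocal.withInitial(() -> new int[1]);

    ThreadLocalIdGenerator(long nodeId, int pid) {
        this.nodeId = nodeId;
        this.pid = pid;
    }

    byte[] nextId() {
        int[] counter = seq.get();
        ByteBuffer buf = ByteBuffer.allocate(ID_BYTES);
        buf.putLong(System.currentTimeMillis());      // 64 bits: wall-clock timestamp
        buf.putShort((short) (nodeId >>> 32));        // 48 bits: node id, high 16 bits
        buf.putInt((int) nodeId);                     //          node id, low 32 bits
        buf.putInt(pid);                              // 32 bits: process id
        // Caveat: the JVM may reuse a thread id after that thread has terminated.
        buf.putLong(Thread.currentThread().getId()); // 64 bits: thread id
        buf.putInt(counter[0]++);                     // 32 bits: per-thread sequence
        return buf.array();
    }

    public static void main(String[] args) {
        // Placeholder node id; pid from ProcessHandle (Java 9+), truncated to 32 bits.
        int pid = (int) ProcessHandle.current().pid();
        ThreadLocalIdGenerator gen = new ThreadLocalIdGenerator(0x0123456789ABL, pid);
        System.out.println(gen.nextId().length + " bytes per id");
    }
}
```

Each thread only ever touches its own counter, so two IDs from the same thread differ in the sequence field and IDs from different live threads differ in the thread field. The cost is the 30 bytes per ID.
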