Christopher Brown asked me:
> Is there a reason a moderately strong random GUID generator is not enough?

But I'm not the OP who wanted unique node-IDs, and the above doesn't
seem to need to remain private, and it's also an interesting problem,
so I'm going to post my reply to the list:

Not if you're okay with non-deterministic but almost-reliable name
collision prevention.

One thing the OP could try, depending on their peer discovery
protocol, is to have newly activated nodes query existing nodes if a
randomly-generated ID is in use. If no reply indicating it's in use
arrives after a while, use it, otherwise generate another and try
again. Of course there's a bit of a race if two nodes try to join
using the same ID at the same time.

I think the only way to completely eliminate collisions might be to
have a single, manually-designated supernode whose job is to
coordinate these things -- the analogue of a DNS root server, or of
ICANN itself, in a sense. New nodes would talk to it first, and it
would assign them ids that could be sequentially generated, e.g.:

(defn make-generator [s]
  (let [x (ref s)]
    (fn []
      (dosync
        (let [y (first @x)]
          (alter x rest)
          y)))))

(def next-id (make-generator (iterate inc 0)))

user=> (next-id)
0
user=> (next-id)
1
user=> (next-id)
2

Note that (iterate inc 0) can be replaced with any other desired
infinite, duplicate-free lazy sequence in the above, and it should be
thread-safe (in particular, not generating duplicate IDs even with
concurrency).

If one really must go fully decentralized, one may need to design
everything to tolerate a low rate of collisions. Alternatively, if one
can implement a distributed ACID database *without* already having
unique node IDs, then one can implement the above make-generator to
use that instead of dosync, without invoking a chicken-and-egg
problem, and each node's bootstrap process can be, basically, 1. start
up as a D-ACID node, 2. generate node ID via a D-ACID transaction, 3.
bootstrap the rest of the way. However, implementing a distributed
ACID database is probably nontrivial. :)

In the long enough run, ID handling would also become more cycle- and
memory-intensive as the IDs got bigger than a long and had to become
bignums, and then as the bignums grew. But this process would be
logarithmic in time with a node population that stayed roughly
constant and linear if the node population grew exponentially, while
computer capacity and speed grow exponentially with time, so I'm not
regarding this as a problem.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to