Hi list,

I've been looking at the Cassandra implementation of the gossip
protocol, and it appears to me that the size of each
GossipDigestSynMessage grows linearly with the size of the
system:

makeRandomGossipDigest():
......
List<InetAddress> endpoints = new ArrayList<InetAddress>(endpointStateMap.keySet());
Collections.shuffle(endpoints, random);

So if there are N servers participating in the gossip protocol, then in
each cycle there are N GossipDigestSynMessages in total, each of a size
linear in N, so the aggregate protocol overhead would be O(N**2).
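
To make the arithmetic concrete, here is a rough sketch of the model I
have in mind. It is not Cassandra code: GossipOverheadSketch,
makeDigests and digestsPerRound are names I made up, endpoints are plain
strings instead of InetAddress, and it assumes every node sends exactly
one SYN per cycle carrying one digest per known endpoint.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class GossipOverheadSketch {
    // Hypothetical stand-in for makeRandomGossipDigest(): one digest
    // entry per known endpoint, emitted in shuffled order.
    static List<String> makeDigests(List<String> knownEndpoints, Random random) {
        List<String> endpoints = new ArrayList<>(knownEndpoints);
        Collections.shuffle(endpoints, random);
        return endpoints;
    }

    // Total digest entries sent in one cycle, ASSUMING each of the n
    // nodes knows all n endpoints and sends exactly one SYN.
    static long digestsPerRound(int n) {
        List<String> all = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            all.add("node-" + i);
        }
        Random random = new Random(42);
        long total = 0;
        for (int sender = 0; sender < n; sender++) {
            total += makeDigests(all, random).size();
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " nodes -> " + digestsPerRound(n)
                    + " digest entries per cycle");
        }
    }
}
```

Under those assumptions the count is N * N, so going from 100 to 1000
nodes multiplies the per-cycle digest traffic by 100.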

I'm totally new to both Cassandra and Java, so this understanding
could be very wrong. But if it's true, why hasn't it been a scalability
concern? Is it because Cassandra servers are often geographically
distributed so the protocol overhead doesn't hit a single site? Or
do Cassandra servers gossip in a hierarchy of groups that I failed
to see?

Also, what purpose does the Collections.shuffle() serve?

Thanks and please kindly CC me on replies.

- Isaac
