Hi list, I've been looking at the Cassandra implementation of the gossip protocol, and it appeared to me that the size of each GossipDigestSynMessage would grow linearly with the size of the system:

    makeRandomGossipDigest():
        ......
        List<InetAddress> endpoints = new ArrayList<InetAddress>(endpointStateMap.keySet());
        Collections.shuffle(endpoints, random);

So if there are N servers participating in the gossip protocol, then in each cycle there are N GossipDigestSynMessages in total, each of a size linear in N, and the aggregate protocol overhead would be O(N**2).

I'm totally new to both Cassandra and Java, so this understanding could be very wrong. But if it's true, why hasn't it been a scalability concern? Is it because Cassandra servers are often geographically distributed, so the protocol overhead doesn't hit a single site? Or do Cassandra servers gossip in a hierarchy of groups that I failed to see?

Also, what purpose does the Collections.shuffle() serve?

Thanks, and please kindly CC me on replies.

- Isaac
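
P.S. To make the arithmetic concrete, here is a small standalone sketch of the estimate I have in mind. It is not Cassandra code: the class name, the helper methods, and the 16-byte digest size are all made up for illustration, and it assumes every node sends exactly one SYN per cycle carrying one digest per known endpoint, which is my reading of the snippet and may be wrong.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class GossipOverheadSketch {
    // Assumed per-digest size in bytes; a made-up figure, not measured from Cassandra.
    static final int ASSUMED_DIGEST_BYTES = 16;

    // Builds a shuffled list of endpoint ids, one entry per known endpoint.
    // Integers stand in for InetAddress; this only mirrors the shape of the quoted snippet.
    static List<Integer> makeRandomDigestOrder(int clusterSize, Random random) {
        List<Integer> endpoints = new ArrayList<>();
        for (int i = 0; i < clusterSize; i++) {
            endpoints.add(i);
        }
        Collections.shuffle(endpoints, random);
        return endpoints;
    }

    // Total digest bytes per cycle, assuming every node sends exactly one SYN
    // and each SYN carries one digest per endpoint in the cluster.
    static long bytesPerRound(int clusterSize) {
        long digestsPerSyn = makeRandomDigestOrder(clusterSize, new Random(42)).size();
        return (long) clusterSize * digestsPerSyn * ASSUMED_DIGEST_BYTES;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " nodes -> " + bytesPerRound(n) + " bytes per cycle");
        }
    }
}
```

Under these assumptions, multiplying the cluster size by ten multiplies the per-cycle total by a hundred, while the shuffle only changes the order of the digests and not how many are sent.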