On 29 August 2013 01:55, Ike Walker <ike.wal...@flite.com> wrote:
> What is the best practice for how many seed nodes to have in a Cassandra
> cluster? I remember reading a recommendation of 2 seeds per datacenter in
> Datastax documentation for 0.7, but I'm interested to know what other
> people are doing these days, especially in AWS.
>
> I'm running a cluster of 12 nodes at AWS. Each node runs Cassandra 1.2.5
> on an m1.xlarge EC2 instance, and they are spread across 3 availability
> zones within a single region.
>
> To keep things simple I currently have all 12 nodes listed as seeds. That
> seems like overkill to me, but I don't know the pros and cons of too many
> or too few seeds.
>
Seeds are used when bootstrapping a new node so that it can discover the others. An existing node stores a list of the other nodes it has seen, so it doesn't need the seeds each time it starts up. Seeds are treated slightly differently in gossip, though, so that a node keeps trying to connect to the seeds in case of a partition.

The best recommendation is to use the same seed list on every node and to list just a few seeds. More than your replication factor is almost certainly too many, but the cost of having too many is very small.

Richard.
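
To make the "same short list on every node" advice concrete, here is a minimal Python sketch of one way to derive such a list. It is only an illustration under assumptions that are not from the thread: the IP addresses and zone names are made up, a replication factor of 3 is assumed, and spreading the seeds across availability zones is one possible choice, not a stated requirement. `choose_seeds` and `seeds_value` are hypothetical helper names, not part of Cassandra.

```python
from collections import defaultdict
from itertools import zip_longest


def choose_seeds(nodes, replication_factor):
    """Pick a short, deterministic seed list from (ip, zone) pairs.

    Assumptions (not from the thread): seeds are spread across zones,
    and the list is capped at the replication factor.
    """
    by_zone = defaultdict(list)
    # Sort first so every node computes the same list from the same input.
    for ip, zone in sorted(nodes):
        by_zone[zone].append(ip)
    # Interleave zones so the first few picks land in different zones.
    rounds = zip_longest(*(by_zone[z] for z in sorted(by_zone)))
    ordered = [ip for rnd in rounds for ip in rnd if ip is not None]
    return ordered[:replication_factor]


def seeds_value(seeds):
    """Join the seeds into one comma-separated string, the form the
    `seeds` parameter in cassandra.yaml takes."""
    return ",".join(seeds)


# Hypothetical 12-node cluster: 4 nodes in each of 3 zones.
nodes = [(f"10.0.{z}.{n}", f"zone-{z}") for z in range(3) for n in range(1, 5)]
seeds = choose_seeds(nodes, replication_factor=3)
print(seeds_value(seeds))
```

Because the input is sorted before picking, the result does not depend on the order in which the nodes are listed, so the same string can be placed in the configuration of all 12 nodes.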