Hi,
It's probably a strange question, but I have a heavily read-optimized
workload where data integrity is not a big deal. So to keep latencies low I
am reading with consistency level ONE from my multi-DC cluster.

I recently needed to add another Cassandra node for redundancy. I booted
the node and then changed the replication settings of my keyspace to
include the new node (all nodes have 100% of the data).

The issue I was seeing is that clients that connected to the new node
afterwards were seeing incomplete data: the key would already be
present, but the columns would all be null values.
I expect this to die down once the node is fully replicated, but in the
meantime a lot of my connected clients were in trouble. (The application
can handle seeing old data; incomplete data is another matter altogether.)

The total data in question is a negligible 500 KB, so in my opinion it
should not take any real amount of time. Still, it took a few minutes for
the data to replicate over, and I am still not sure everything was
replicated correctly.

Increasing the RF to something higher won't really help, as the setup is
dc1: 3; dc2: 2 (I added the second node in dc2). So a LOCAL_QUORUM in dc2
would still be 2 nodes, which means I can't lose either of them.
Adding a third node is not really cost-effective for the current workloads
these nodes need to handle.
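
To spell out the quorum arithmetic behind that, here is a minimal sketch in
Python. The helper names are mine and purely illustrative; the formula is
the usual floor(RF / 2) + 1 for a quorum within one datacenter.

```python
def quorum(rf: int) -> int:
    """Replicas that must answer for a (LOCAL_)QUORUM at replication factor rf."""
    return rf // 2 + 1


def tolerated_failures(rf: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return rf - quorum(rf)


# Replication factors from my setup: dc1 has 3 replicas, dc2 has 2.
replication = {"dc1": 3, "dc2": 2}

for dc, rf in replication.items():
    print(f"{dc}: RF={rf}, quorum={quorum(rf)}, "
          f"can lose {tolerated_failures(rf)} node(s)")
```

With RF 2 the quorum is both replicas, so dc2 cannot lose a node and still
serve LOCAL_QUORUM reads, while dc1 with RF 3 can lose one.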

Any advice on how to avoid this in the future? Is there a way to start up a
node so that it replicates data but does not serve client requests until it
has caught up?

Greetings,
Daniel
