On Thu, Aug 3, 2017 at 9:33 AM, Daniel Hölbling-Inzko <daniel.hoelbling-in...@bitmovin.com> wrote:
> No I set Auto bootstrap to true and the node was UN in nodetool status but
> when doing a select on the node with ONE I got incomplete data.
>

What I think is happening here is not related to the new node being added.

When you increase the Replication Factor, that does not automatically
redistribute the existing data. It just makes other nodes responsible for
portions of the data they might not really have yet. So I would expect all
your nodes to show some inconsistencies until you run a full repair of the
ring.

I can fairly easily reproduce it locally with ccm[1], 3 nodes, version 3.0.13.

$ ccm status
Cluster: 'v3013'
----------------
node1: UP
node3: UP
node2: UP

$ ccm node1 cqlsh
cqlsh> create keyspace test_rf WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 1};
cqlsh> create table test_rf.t1(id int, data text, primary key(id));
cqlsh> insert into test_rf.t1(id, data) values(1, 'one');
cqlsh> select * from test_rf.t1;

 id | data
----+------
  1 |  one

(1 rows)

At this point selecting from t1 works correctly on any of the nodes with the
default CL=ONE.

If we now increase the RF and try reading again, something surprising
happens:

cqlsh> alter keyspace test_rf WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 2};
cqlsh> select * from test_rf.t1;

 id | data
----+------

(0 rows)

And in my test this happens on all nodes at the same time. The explanation is
fairly simple: a different node is now also responsible for data that was
previously written to only one other node, and a read at CL=ONE can be
answered by the replica that does not have it yet.

A repair in this tiny test is trivial:

cqlsh> CONSISTENCY ALL;
cqlsh> select * from test_rf.t1;

 id | data
----+------
  1 |  one

(1 rows)

And now the data can be read from any node again, since reading at
CONSISTENCY ALL effectively did a "full repair" of this one row.

--
Alex

[1] https://github.com/pcmanus/ccm
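
As an illustration of the explanation above, here is a toy Python model of the mechanism. It is a sketch under stated assumptions, not Cassandra code: the `ToyCluster` class, the `replicas` placement rule (key modulo node count, then the next nodes on the ring) and the `replica_index` parameter are all hypothetical, and there are no timestamps, snitches or digests. It only shows that changing the RF changes ownership without moving data, that a read from a single replica can then come back empty, and that a read touching all replicas can write the value back.

```python
# Toy model, NOT Cassandra code: three nodes on a ring, each with its own local store.
NODES = ["node1", "node2", "node3"]


def replicas(key, rf):
    """Hypothetical placement: primary by key modulo, then the next rf-1 nodes on the ring."""
    start = key % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(rf)]


class ToyCluster:
    def __init__(self, rf):
        self.rf = rf
        self.store = {n: {} for n in NODES}

    def write(self, key, value):
        # A write only reaches the nodes that are replicas at write time.
        for n in replicas(key, self.rf):
            self.store[n][key] = value

    def read_one(self, key, replica_index=0):
        # Like CL=ONE: a single replica answers; whatever it has (or lacks) is the result.
        node = replicas(key, self.rf)[replica_index]
        return self.store[node].get(key)

    def read_all(self, key):
        # Like CL=ALL: every replica is consulted, and replicas missing the value get it back.
        owners = replicas(key, self.rf)
        found = [self.store[n][key] for n in owners if key in self.store[n]]
        if not found:
            return None
        value = found[0]  # no timestamps in this toy model
        for n in owners:
            self.store[n][key] = value
        return value


cluster = ToyCluster(rf=1)
cluster.write(1, "one")
cluster.rf = 2  # like ALTER KEYSPACE: ownership changes, no data moves

before = [cluster.read_one(1, i) for i in range(2)]  # ask each replica on its own
repaired = cluster.read_all(1)
after = [cluster.read_one(1, i) for i in range(2)]
print(before, repaired, after)
```

In the toy the choice of answering replica is made explicit through `replica_index`; in a real cluster the coordinator picks it, so which reads come back empty will differ from this model.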