Yep, if you don't run cleanup on all nodes (except the new nodes) after step x,
then when you decommission nodes 4 and 5 later on, their tokens will be
reclaimed by the previous owners. Suddenly the data in those SSTables is live
again, because the token ownership has changed and any data in those
SSTables will be returned.

Remember, new nodes only add tokens to the ring; they don't affect other
nodes' tokens. So if you remove those tokens, everything goes back to how it
was before those nodes were added.
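
If you want to see that in action, nodetool getendpoints shows which nodes
own a given key at any point in time; something like the following (keyspace,
table and key are just placeholders):

    # which replicas currently own the row with partition key 'k1'
    nodetool getendpoints my_keyspace my_table k1
    # re-run after adding nodes 4/5 and again after decommissioning them;
    # the endpoints move to the new nodes and then back to the old ones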

Adding a marker would be incredibly complicated, and it doesn't really fit
the design of Cassandra. It's probably much easier to just follow the
recommended procedure when adding and removing nodes.
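
For reference, that procedure boils down to something like this (hostnames
and keyspace name are only examples):

    # after every new node has finished joining, drop the data the
    # original nodes no longer own
    for host in node1 node2 node3; do
        ssh "$host" nodetool cleanup my_keyspace
    done

    # when shrinking, decommission streams a node's data back to the
    # remaining replicas before its tokens are given up
    ssh node5 nodetool decommission
    ssh node4 nodetool decommission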

On 16 Dec. 2017 01:37, "Python_Max" <python....@gmail.com> wrote:

Hello, Jeff.


Using your hint I was able to reproduce my situation on 5 VMs.
The simplified steps are as follows (a rough sketch of the key commands is
below):
1) set up 3-node cluster
3) create a keyspace with RF=3 and a table with gc_grace_seconds=60,
compaction_interval=10 and unchecked_tombstone_compaction=true (to force
compaction later)
3) insert 10..20 records with different partition and clustering keys
(consistency 'all')
4) 'nodetool flush' on all 3 nodes
5) add 4th node, add 5th node
6) using 'nodetool getendpoints', find a key that moved to both the 4th and
5th nodes
7) delete that record from table (consistency 'all')
8) 'nodetool flush' on all 5 nodes, wait gc_grace_seconds, 'nodetool
compact' on the nodes responsible for that key, and check that the key and
tombstone are gone using sstabledump
9) decommission 5th node, decommission 4th node
10) select data from table where key=key (consistency quorum)

And the deleted row is back.
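
In case it helps someone reproduce this, the schema and the delete were
roughly the following (names are arbitrary, and I'm mapping my compaction
settings onto tombstone_compaction_interval, so treat the exact options as
approximate):

    cqlsh node1 -e "
      CREATE KEYSPACE ks WITH replication =
        {'class': 'SimpleStrategy', 'replication_factor': 3};
      CREATE TABLE ks.t (pk int, ck int, v text, PRIMARY KEY (pk, ck))
        WITH gc_grace_seconds = 60
        AND compaction = {'class': 'SizeTieredCompactionStrategy',
                          'tombstone_compaction_interval': 10,
                          'unchecked_tombstone_compaction': true};"

    # steps 3/7: write and later delete a row at consistency ALL
    cqlsh node1 -e "CONSISTENCY ALL; INSERT INTO ks.t (pk, ck, v) VALUES (1, 1, 'x');"
    cqlsh node1 -e "CONSISTENCY ALL; DELETE FROM ks.t WHERE pk = 1;"

    # step 8: flush everywhere, wait out gc_grace_seconds, then compact
    # on the replicas for that key and inspect with sstabledump
    nodetool flush ks t
    nodetool compact ks t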

It sounds like a bug in Cassandra, but since it is documented here
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html
I suppose it counts as a feature. It would be better if data that stays in
an SSTable after a new node is added carried some marker and was never
returned as a result of a select query.

Thank you very much, Jeff, for pointing me in the right direction.


On 13.12.17 18:43, Jeff Jirsa wrote:

> Did you run cleanup before you shrank the cluster?
>
>
-- 

Best Regards,
Python_Max.


