Hi,
We were running a 3-node cluster of Cassandra 0.6.13 with RF=3.
After we added a fourth node, keeping RF=3, some old data appeared in
the database.
As far as I understand, this can only happen if nodetool repair wasn't
run on a node for longer than GCGraceSeconds.
Our GCGraceSeconds is set to the default of 10 days (864000 seconds).
We have a scheduled cron job that runs repair once a week on every node,
each node on a different day.
I'm sure that none of the nodes ever skipped running a repair.
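For reference, here is the arithmetic behind that schedule as a minimal
sketch. The function name is only illustrative, and it assumes each
repair finishes within its scheduled day, so the gap between two repairs
on the same node is about 7 days:

```python
# Sanity check: does a weekly repair schedule fit inside GCGraceSeconds?
GC_GRACE_SECONDS = 864000      # our setting, the default (10 days)
REPAIR_INTERVAL_DAYS = 7       # each node is repaired once a week
SECONDS_PER_DAY = 24 * 60 * 60


def repair_margin_seconds(gc_grace_seconds, repair_interval_days):
    """Return the slack, in seconds, between the repair interval and GC grace.

    A positive value means repairs recur before GCGraceSeconds elapses.
    """
    return gc_grace_seconds - repair_interval_days * SECONDS_PER_DAY


margin = repair_margin_seconds(GC_GRACE_SECONDS, REPAIR_INTERVAL_DAYS)
print(margin, margin / SECONDS_PER_DAY)  # prints: 259200 3.0
```

So on paper there should be roughly 3 days of slack per node, provided
no repair run is delayed or fails silently.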
We don't run compact on the nodes explicitly, as I understand that
running repair will trigger a major compaction. I'm not entirely sure
that it does, but in any case the tombstones will be removed by a minor
compaction. So I expected that the reappearing data, which is a couple
of months old in some cases, was long gone by the time we added the
node.
Can anyone think of any reason why the old data reappeared?
Stefan