> After we added a fourth node, keeping RF=3, some old data appeared in the
> database.

What CL are you working at? (It should not matter too much if repair is working, just asking.)

> We don't run compact on the nodes explicitly as I understand that running
> repair will trigger a
> major compaction. I'm not entirely sure if it does so, but in any case the
> tombstones will be removed by a minor
> compaction.

In 0.6.x tombstones were only purged during a major / manual compaction. Purging during minor compaction came in 0.7:
https://github.com/apache/cassandra/blob/trunk/CHANGES.txt#L1467

> Can anyone think of any reason why the old data reappeared?

It sounds like you are doing things correctly. The complicating factor is that 0.6 is so very old.

If I wanted to poke around some more, I would conduct reads at CL ONE against individual nodes and see if they return the "deleted" data or not. This would help me understand if the tombstone is still out there.

I would also poke around a lot in the logs to make sure repair was running as expected and completing. If you find anything suspicious, post examples.

Finally, I would ensure CL QUORUM was being used.

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 10:13 PM, Stefan Reek wrote:

> Hi,
>
> We were running a 3-node cluster of cassandra 0.6.13 with RF=3.
> After we added a fourth node, keeping RF=3, some old data appeared in the
> database.
> As far as I understand this can only happen if nodetool repair wasn't run for
> more than GCGraceSeconds.
> Our GCGraceSeconds is set to the default of 10 days (864000 seconds).
> We have a scheduled cronjob to run repair once each week on every node, each
> on another day.
> I'm sure that none of the nodes ever skipped running a repair.
> We don't run compact on the nodes explicitly as I understand that running
> repair will trigger a
> major compaction. I'm not entirely sure if it does so, but in any case the
> tombstones will be removed by a minor
> compaction.
> So I expected that the reappearing data, which is a couple of
> months old in some cases, was long gone
> by the time we added the node.
>
> Can anyone think of any reason why the old data reappeared?
>
> Stefan
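
When going through the logs to confirm that repair really completed every week, one rough way to check the timing is to collect the completion times for a node and flag any gap longer than GCGraceSeconds. The sketch below is only an illustration: the function name and the sample dates are made up, and it assumes the completion timestamps have already been extracted from the logs by hand or by some other script.

```python
from datetime import datetime, timedelta

# Default from the thread: 10 days, expressed in seconds.
GC_GRACE_SECONDS = 864000


def repair_gaps_exceeding_grace(repair_times, gc_grace_seconds=GC_GRACE_SECONDS):
    """Return (previous, next) pairs of repair completion times that are
    further apart than gc_grace_seconds. Hypothetical helper, not part of
    Cassandra or nodetool."""
    grace = timedelta(seconds=gc_grace_seconds)
    ordered = sorted(repair_times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > grace]


# Made-up sample data: a weekly schedule, and the same schedule with one
# run missing, which leaves a 14 day gap.
weekly = [datetime(2012, 1, 1) + timedelta(days=7 * i) for i in range(4)]
skipped = [weekly[0], weekly[1], weekly[3]]

print(repair_gaps_exceeding_grace(weekly))
print(repair_gaps_exceeding_grace(skipped))
```

A weekly schedule stays inside the 10 day window, so the first call reports no gaps. A single missed or failed run is enough to exceed it, which is why it is worth confirming from the logs that each repair actually finished.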