Re: Old data coming alive after adding node

Stefan Reek Tue, 06 Mar 2012 02:14:30 -0800

Hi Aaron,

Thanks for the quick reply.
All our writes/deletes are done with CL.QUORUM.

Our reads are done with CL.ONE, although the reads that confirmed the old data were done with CL.QUORUM.

According to https://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/CHANGES.txt 0.6.6 has the same patch (CASSANDRA-1074) as 0.7, so I assumed that minor compactions in 0.6.6 and up also purged tombstones.

The only suspicious thing I noticed was that after adding the fourth node, repairs became extremely slow and heavy. Running a repair degraded the performance of the whole cluster, and the new node even went OOM while running it.


Cheers,

Stefan

On 03/06/2012 10:51 AM, aaron morton wrote:

> After we added a fourth node, keeping RF=3, some old data appeared in the database.

What CL are you working at? (Should not matter too much with repair working, just asking.)

> We don't run compact on the nodes explicitly as I understand that running repair will trigger a major compaction. I'm not entirely sure if it does so, but in any case the tombstones will be removed by a minor compaction.

In 0.6.x tombstones were only purged during a major / manual compaction. Purging during minor compaction came in during 0.7
https://github.com/apache/cassandra/blob/trunk/CHANGES.txt#L1467

> Can anyone think of any reason why the old data reappeared?

It sounds like you are doing things correctly. The complicating factor is that 0.6 is so very old.
If I wanted to poke around some more, I would conduct reads at CL ONE against individual nodes and see if they return the "deleted" data or not. This would help me understand if the tombstone is still out there.
I would also poke around a lot in the logs to make sure repair was running as expected and completing. If you find anything suspicious, post examples.
Finally, I would ensure CL QUORUM was being used.
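
The per-node check suggested above can be pictured with a toy model. This is only a sketch of the reasoning, not Cassandra client code: the `replicas` dictionary, the node names, and the `classify` helper are all hypothetical, and the three replica states are made-up examples of what a single-node read might reveal.

```python
# Toy model: each replica maps a key to (value, timestamp, is_tombstone).
# None of this is Cassandra API; it only illustrates the reasoning.
replicas = {
    "node1": {"k": (None, 200, True)},      # still holds the tombstone
    "node2": {},                            # tombstone already purged
    "node3": {"k": ("old", 100, False)},    # never saw the delete
}


def classify(replica, key):
    """Describe what a read against this single replica would see."""
    cell = replica.get(key)
    if cell is None:
        return "nothing"
    value, timestamp, is_tombstone = cell
    return "tombstone" if is_tombstone else "live data"


report = {name: classify(data, "k") for name, data in replicas.items()}
for name, state in sorted(report.items()):
    print(name, state)
```

In this made-up scenario, a replica that reports live data while the others report a tombstone or nothing is the one worth investigating, since its copy can win once the tombstones elsewhere are gone.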

Hope that helps.


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 10:13 PM, Stefan Reek wrote:
Hi,

We were running a 3-node cluster of cassandra 0.6.13 with RF=3.
After we added a fourth node, keeping RF=3, some old data appeared in the database.
As far as I understand this can only happen if nodetool repair wasn't run for more than GCGraceSeconds.
Our GCGraceSeconds is set to the default of 10 days (864000 seconds).
We have a scheduled cronjob to run repair once each week on every node, each on a different day.
I'm sure that none of the nodes ever skipped running a repair.
We don't run compact on the nodes explicitly as I understand that running repair will trigger a major compaction. I'm not entirely sure if it does so, but in any case the tombstones will be removed by a minor compaction. So I expected that the reappearing data, which is a couple of months old in some cases, was long gone by the time we added the node.
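
The schedule described above can be checked with a little arithmetic. This is a minimal sketch: the function name and the simple "interval versus grace period" model are illustrative assumptions, not anything taken from Cassandra itself. It only compares the gap between two scheduled repairs on one node with GCGraceSeconds.

```python
GC_GRACE_SECONDS = 864000          # default from the message above: 10 days
REPAIR_INTERVAL_DAYS = 7           # weekly cron job per node

SECONDS_PER_DAY = 24 * 60 * 60


def repair_margin_seconds(gc_grace_seconds, repair_interval_days):
    """Return the slack between the repair interval and the gc grace period.

    A negative result would mean a tombstone could be purged before the
    next repair had a chance to propagate it.
    """
    return gc_grace_seconds - repair_interval_days * SECONDS_PER_DAY


margin = repair_margin_seconds(GC_GRACE_SECONDS, REPAIR_INTERVAL_DAYS)
print(margin / SECONDS_PER_DAY)  # slack in days
```

On this simple model a weekly repair leaves three days of slack, so a single repair run that fails or is skipped would stretch the gap to 14 days, which is longer than the grace period.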

Can anyone think of any reason why the old data reappeared?

Stefan
