> But what if the gc_grace was changed to a lower value as part of a schema migration, after the hints had been marked with TTLs equal to the (lowest) gc_grace in effect before the migration?

There would be a chance of that then, if the tombstones had been purged before the hint was replayed. Want to raise a ticket?
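A rough timeline sketch of that race, with illustrative durations (this is not Cassandra code and the numbers are made up):

# A hint written while a replica was down keeps the TTL it was given at
# write time (the gc_grace in effect then); tombstones created after the
# schema migration become purgeable on the new, shorter gc_grace.
DAY = 24 * 3600

old_gc_grace = 10 * DAY   # gc_grace when the hint was stored (= hint TTL)
new_gc_grace = 1 * DAY    # gc_grace after the schema migration

hint_stored_at = 0                      # write hinted while a replica is down
delete_issued_at = hint_stored_at + 1   # delete reaches the live replicas

hint_expires_at = hint_stored_at + old_gc_grace
tombstone_purgeable_at = delete_issued_at + new_gc_grace  # once compacted

# If the tombstone can be purged while the hint is still alive, replaying
# the hint after that point can resurrect the deleted data.
window = hint_expires_at - tombstone_purgeable_at
print(f"resurrection window: {window / DAY:.1f} days" if window > 0
      else "no window: hint expires before tombstones can be purged")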
Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 29/03/2013, at 2:58 AM, Arya Goudarzi <gouda...@gmail.com> wrote:

> I am not familiar with that part of the code yet. But what if the gc_grace was changed to a lower value as part of a schema migration, after the hints had been marked with TTLs equal to the (lowest) gc_grace in effect before the migration?
>
> From what you've described, I think this is not an issue for us, as we did not have a node down for a long period of time; I am just pointing out what I think could happen based on what you've described.
>
> On Sun, Mar 24, 2013 at 10:03 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> I could imagine a scenario where a hint was replayed to a replica after all replicas had purged their tombstones
> Scratch that, the hints are TTL'd with the lowest gc_grace.
> Ticket closed https://issues.apache.org/jira/browse/CASSANDRA-5379
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 24/03/2013, at 6:24 AM, aaron morton <aa...@thelastpickle.com> wrote:
>
>>> Beside the joke, would hinted handoff really have any role in this issue?
>> I could imagine a scenario where a hint was replayed to a replica after all replicas had purged their tombstones. That seems like a long shot: it would need one node to be down for the write, all nodes up for the delete, and all of them to have purged the tombstone. But maybe we should have a max age on hints so it cannot happen.
>>
>> Created https://issues.apache.org/jira/browse/CASSANDRA-5379
>>
>> Ensuring no hints are in place during an upgrade would work around it. I tend to make sure hints and the commit log are clear during an upgrade.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 22/03/2013, at 7:54 AM, Arya Goudarzi <gouda...@gmail.com> wrote:
>>
>>> Beside the joke, would hinted handoff really have any role in this issue? I have been struggling to reproduce this issue using the snapshot data taken from our cluster and following the same upgrade process from 1.1.6 to 1.1.10. I know snapshots only link to active SSTables. What if these returned rows belong to some inactive SSTables and some bug exposed itself and marked them as active? What are the possibilities that could lead to this? I am eager to find out, as this is blocking our upgrade.
>>>
>>> On Tue, Mar 19, 2013 at 2:11 AM, <moshe.kr...@barclays.com> wrote:
>>> This obscure feature of Cassandra is called “haunted handoff”.
>>>
>>> Happy (early) April Fools :)
>>>
>>> From: aaron morton [mailto:aa...@thelastpickle.com]
>>> Sent: Monday, March 18, 2013 7:45 PM
>>> To: user@cassandra.apache.org
>>> Subject: Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to 1.1.10
>>>
>>> As you see, this node thinks lots of ranges are out of sync, which shouldn't be the case as successful repairs were done every night prior to the upgrade.
>>>
>>> Could this be explained by writes occurring during the upgrade process?
>>>
>>> I found this bug which touches timestamps and tombstones, which was fixed in 1.1.10, but am not 100% sure if it could be related to this issue: https://issues.apache.org/jira/browse/CASSANDRA-5153
>>>
>>> Me neither, but the issue was fixed in 1.1.10.
>>>
>>> It appears that the repair task that I executed after the upgrade brought back lots of deleted rows into life.
>>>
>>> Was it entire rows or columns in a row?
>>> Do you know if row-level or column-level deletes were used?
>>>
>>> Can you look at the data in cassandra-cli and confirm that the timestamps on the columns make sense?
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 16/03/2013, at 2:31 PM, Arya Goudarzi <gouda...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I have upgraded our test cluster from 1.1.6 to 1.1.10, followed by running repairs. It appears that the repair task that I executed after the upgrade brought back lots of deleted rows into life. Here are some logistics:
>>>
>>> - The upgraded cluster started from 1.1.1 -> 1.1.2 -> 1.1.5 -> 1.1.6
>>> - Old cluster: 4 nodes, C* 1.1.6 with RF3 using NetworkTopology;
>>> - Upgraded to: 1.1.10 with all other settings the same;
>>> - Successful repairs were being done on this cluster every night;
>>> - Our clients use nanosecond-precision timestamps for Cassandra calls;
>>> - After the upgrade, while running repair, I saw some log messages like this on one node:
>>>
>>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,847 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.207.56 have 2223 range(s) out of sync for App
>>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,877 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.250.43 and /23.20.207.56 have 161 range(s) out of sync for App
>>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:55,097 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.250.43 have 2294 range(s) out of sync for App
>>> system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:59,190 AntiEntropyService.java (line 789) [repair #0990f320-8da9-11e2-0000-e9b2bd8ea1bd] App is fully synced (13 remaining column family to sync for this session)
>>>
>>> As you see, this node thinks lots of ranges are out of sync, which shouldn't be the case as successful repairs were done every night prior to the upgrade.
>>>
>>> The App CF uses SizeTiered compaction with a gc_grace of 10 days. It has caching = 'ALL', and it is fairly small (11 MB on each node).
>>>
>>> I found this bug which touches timestamps and tombstones, which was fixed in 1.1.10, but am not 100% sure if it could be related to this issue: https://issues.apache.org/jira/browse/CASSANDRA-5153
>>>
>>> Any advice on how to dig deeper into this would be appreciated.
>>>
>>> Thanks,
>>> -Arya
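On the timestamp check suggested earlier in the thread: since the clients write nanosecond-precision timestamps, one quick sanity check is to take a column timestamp from cassandra-cli output and decode it under both Cassandra's microsecond convention and nanoseconds, then see which gives a plausible date. A minimal Python sketch, using a made-up example value:

from datetime import datetime, timezone

def decode(ts: int) -> None:
    """Interpret a column timestamp as microseconds and as nanoseconds since the epoch."""
    for unit, divisor in (("micros", 10**6), ("nanos", 10**9)):
        try:
            when = datetime.fromtimestamp(ts / divisor, tz=timezone.utc)
            print(f"as {unit:>6}: {when.isoformat()}")
        except (OverflowError, OSError, ValueError):
            print(f"as {unit:>6}: out of range -> implausible in this unit")

# Hypothetical value copied from a column in cassandra-cli output; a
# nanosecond-precision client produces values ~1000x larger than the
# microsecond convention, so only the nanosecond reading decodes sanely.
decode(1363392954847000000)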
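As a footnote to the "max age on hints" idea raised in this thread (CASSANDRA-5379), a minimal sketch of what capping the hint TTL could look like; the names and the cap value are hypothetical, not Cassandra's actual hint code:

# Hypothetical cap on hint lifetime; not Cassandra's implementation.
MAX_HINT_AGE_SECONDS = 3 * 3600

def hint_ttl_seconds(lowest_gc_grace_seconds: int) -> int:
    """TTL to apply when storing a hint.

    Cassandra 1.1 TTLs hints with the lowest gc_grace of the mutation's
    column families; an additional fixed cap would bound how long a hint
    can linger, narrowing the window in which it could replay after
    tombstones have been purged.
    """
    return min(lowest_gc_grace_seconds, MAX_HINT_AGE_SECONDS)

print(hint_ttl_seconds(10 * 24 * 3600))  # 10-day gc_grace -> capped at 10800s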