2014-02-03 Robert Coli <rc...@eventbrite.com>:
> On Mon, Feb 3, 2014 at 1:02 PM, olek.stas...@gmail.com
> <olek.stas...@gmail.com> wrote:
>>
>> Today I've noticed that oldest files with broken values appear during
>> repair (we do repair once a week on each node). Maybe it's the repair
>> operation, which caused data loss?
>
> Yes, unless you added or removed or replaced nodes, it would have to be the
> repair operation, which streams SSTables. Did you run the repair during the
> upgradesstables?

No, I ran the repair after upgradesstables. In fact it was about 4 weeks
later, because of this bug:
https://issues.apache.org/jira/browse/CASSANDRA-6277.
We upgraded Cassandra to 2.0.2 and then, about a month later, to 2.0.3
because of 6277. Only then were we able to run repair, so I set up cron to
do it weekly on each node (that was around 10 Dec 2013). The loss was
discovered around New Year's Eve.

>>
>> I've no idea. Currently our cluster
>> is running 2.0.3 version.
>
> 2.0.3 has serious bugs, upgrade to 2.0.4 ASAP.

OK

>>
>> But our most crucial question is: can we recover loss, or should we
>> start to think how to re-gather them?
>
> If I were you, I would do the latter. You can to some extent recover them
> via manual processes dumping with sstable2json and so forth, but it will be
> quite painful.
>
> http://thelastpickle.com/2011/12/15/Anatomy-of-a-Cassandra-Partition/
>
> Contains an explanation of how one could deal with it.

Sorry, but I have to admit that I can't transfer this solution to my
problem. Could you briefly describe the steps I should perform to recover?

best regards
Aleksander

>
> =Rob
>
>>
>> best regards
>> Aleksander
>> ps. I like your link Rob, i'll pin it over my desk ;) In Oracle there
>> were a rule: never deploy RDBMS before release 2 ;)
>>
>> 2014-02-03 Robert Coli <rc...@eventbrite.com>:
>> > On Mon, Feb 3, 2014 at 12:51 AM, olek.stas...@gmail.com
>> > <olek.stas...@gmail.com> wrote:
>> >>
>> >> We've faced very similar effect after upgrade from 1.1.7 to 2.0 (via
>> >> 1.2.10). Probably after upgradesstable (but it's only a guess,
>> >> because we noticed problem few weeks later), some rows became
>> >> tombstoned.
>> >
>> > To be clear, you didn't run SSTableloader at all? If so, this is the
>> > hypothetical case where normal streaming operations (replacing a node?
>> > what streaming did you do?) results in data loss...
>> >
>> > Also, CASSANDRA-6527 is a good reminder regarding the following:
>> >
>> > https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>> >
>> > =Rob
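
For the sstable2json route mentioned in the thread, a first step could be to find which rows and columns are tombstoned in a dump. The following is only a rough sketch, not a recovery procedure confirmed by anyone in this thread. It assumes the JSON layout of the 2.0-era sstable2json: a list of rows, each with a "key", an optional "metadata"/"deletionInfo" entry for row-level deletes, and "columns" entries of the form [name, value, timestamp] with an optional flag where "d" marks a deleted column. The function names are hypothetical, and the layout should be checked against a real dump before relying on it.

```python
import json


def load_dump(path):
    """Load a file holding the redirected output of sstable2json."""
    with open(path) as fh:
        return json.load(fh)


def find_tombstones(rows):
    """Return (row_key, column_name, timestamp) for each tombstone found.

    column_name is None for a row-level delete.
    """
    hits = []
    for row in rows:
        # Assumption: row-level deletes appear under metadata.deletionInfo.
        info = row.get("metadata", {}).get("deletionInfo")
        if info:
            hits.append((row["key"], None, info.get("markedForDeleteAt")))
        for col in row.get("columns", []):
            # Assumption: a fourth element "d" flags a deleted column.
            if len(col) >= 4 and col[3] == "d":
                hits.append((row["key"], col[0], col[2]))
    return hits
```

Running find_tombstones(load_dump("dump.json")) over dumps of the older SSTables (or a backup) and comparing timestamps would at least show whether the live values still exist somewhere on disk; "dump.json" here is a placeholder name.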