RE: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Or Offer
Hi everyone, we are looking for help with upgrading our Cassandra from 0.6 to 0.7.2 here in Israel. If there is anyone here who can help out with consulting, please email me. Thanks, Or Offer, SimilarGroup, or.of...@similargroup.com

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Daniel Doubleday
Firstly, any ideas for a quick fix? This is giving me big production problems. Writes/reads with QUORUM are reportedly producing unpredictable results (people have called support regarding monsters in my MMO appearing and disappearing magically) and many operations are just failing with

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Dominic Williams
Links for the issues causing this: http://issues.apache.org/jira/browse/CASSANDRA-2670 http://issues.apache.org/jira/browse/CASSANDRA-2280 For anyone in this boat, my advice is: 1. Do a rolling restart immediately, starting with the node you were running repair on. If you don't do this, the other nod
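A rolling restart along the lines suggested above might look like the following minimal sketch. Hostnames and the restart command are placeholders; the exact way to stop and start the Cassandra process depends on your packaging, and the node that was running repair goes first.

    # flush memtables and stop the node accepting traffic before shutdown
    nodetool -h node1.example.com drain
    # restart the Cassandra process (exact command depends on your installation)
    sudo /etc/init.d/cassandra restart
    # confirm the node has rejoined the ring as Up/Normal before moving on
    nodetool -h node1.example.com ring

Repeat node by node, waiting for each to come back into the ring before restarting the next.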

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Dominic Williams
Jeepers creepers, that's it Jeeves!!! Argh. Basically, once my repair hit a big column family, the db size exploded until the node ran out of disk space. Firstly, any ideas for a quick fix? This is giving me big production problems. Writes/reads with QUORUM are reportedly producing unpredict
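To gauge how badly a node has inflated compared to its neighbours, something along these lines can help. This is a sketch only; the hostname, keyspace name, and data directory path are assumptions and depend on your configuration.

    # reported load per node; an inflated node stands out against the rest of the ring
    nodetool -h node1.example.com ring
    # per-column-family SSTable counts and live vs. total disk space used
    nodetool -h node1.example.com cfstats
    # raw on-disk size of the keyspace's data directory
    du -sh /var/lib/cassandra/data/MyKeyspace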

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread Daniel Doubleday
We are having problems with repair too. It sounds like yours are the same. From today: http://permalink.gmane.org/gmane.comp.db.cassandra.user/16619 On May 25, 2011, at 4:52 PM, Dominic Williams wrote: > Hi, > > I've got a strange problem, where the database on a node has inflated 10X > after

Re: Database grows 10X bigger after running nodetool repair

2011-05-25 Thread jonathan . colby
I'm not sure if this is the absolute best advice, but perhaps running "cleanup" on the data will help clean up any data that isn't assigned to this node's token, in case you've moved the cluster around before. Any exceptions in the logs, e.g. EOF? I experienced this and it caused the repairs to trip
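The cleanup and log check suggested above might look like this. The hostname and log path are assumptions (the log location depends on log4j configuration), and cleanup only removes data outside the node's token range, so it will not undo growth caused by the repair bugs linked earlier.

    # remove data that no longer belongs to this node's token range
    nodetool -h node1.example.com cleanup
    # look for EOF or other stream-related exceptions around the time of the repair
    grep -iE "EOFException|exception" /var/log/cassandra/system.log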