On Tue, Sep 13, 2011 at 3:57 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:
> > I think it is a serious problem since I can not "repair"..... I am
> > using Cassandra on production servers. Is there some way to fix it
> > without upgrading? I heard that 0.8.x is still not quite ready for
> > production environments.
>
> It is a serious issue if you really need to repair one CF at the time.

Why is it serious to repair one CF at a time? If I cannot repair at the
CF level, does that mean I cannot use more than 50% of the disk space?
Is this specific to this problem, or is it a general statement?

I ask because I am planning to repair one CF at a time so that I can
limit the maximum disk overhead to one CF's worth (plus some factor). I
am going to be testing this in the next couple of weeks or so.

> However, looking at your original post it seems this is not
> necessarily your issue. Do you need to, or was your concern rather the
> overall time repair took?
>
> There are other things that are improved in 0.8 that affect 0.7. In
> particular, (1) in 0.7 compaction, including validating compactions
> that are part of repair, is non-concurrent so if your repair starts
> while there is a long-running compaction going it will have to wait,
> and (2) semi-related is that the Merkle tree calculation that is part
> of repair/anti-entropy may happen "out of synch" if one of the nodes
> participating happens to be busy with compaction. This in turn causes
> additional data to be sent as part of repair.
>
> That might be why your immediately following repair took a long time,
> but it's difficult to tell.
>
> If you're having issues with repair and large data sets, I would
> generally say that upgrading to 0.8 is recommended. However, if you're
> on 0.7.4, beware of
> https://issues.apache.org/jira/browse/CASSANDRA-3166
>
> --
> / Peter Schuller (@scode on twitter)
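
To put rough numbers on the disk-overhead question above, here is a minimal sketch of the arithmetic. The CF names and sizes are made up, and the worst-case model (transient space equal to the size of the data being processed at once) is an assumption for illustration, not measured Cassandra behavior.

```python
# Hypothetical column family sizes in GB; invented for illustration only.
cf_sizes_gb = {"users": 120, "events": 300, "index": 80}


def overhead_all_at_once(sizes):
    """Assumed worst case: transient space equal to all live data."""
    return sum(sizes.values())


def overhead_one_cf_at_a_time(sizes, factor=1.0):
    """Assumed worst case when CFs are handled one by one:
    the largest CF, scaled by a safety factor."""
    return max(sizes.values()) * factor


def required_disk(sizes, overhead):
    """Live data plus the transient overhead that must fit alongside it."""
    return sum(sizes.values()) + overhead


def usable_fraction(sizes, overhead):
    """Share of the disk that can hold live data under this model."""
    return sum(sizes.values()) / required_disk(sizes, overhead)


if __name__ == "__main__":
    for label, overhead in [
        ("all at once", overhead_all_at_once(cf_sizes_gb)),
        ("one CF at a time", overhead_one_cf_at_a_time(cf_sizes_gb)),
    ]:
        total = required_disk(cf_sizes_gb, overhead)
        frac = usable_fraction(cf_sizes_gb, overhead)
        print(f"{label}: need {total} GB, usable fraction {frac:.0%}")
```

Under these assumptions, processing everything at once caps usable space at 50% of the disk, while going one CF at a time is bounded by the largest CF, so the benefit shrinks if a single CF dominates the data set.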