> I think it is a serious problem since I can not "repair"..... I am
> using cassandra on production servers. is there some way to fix it
> without upgrade? I heard of that 0.8.x is still not quite ready in
> production environment.
It is a serious issue if you really need to repair one CF at a time. However, looking at your original post, it seems this is not necessarily your issue. Do you need to, or was your concern rather the overall time repair took?

There are other things affecting 0.7 that are improved in 0.8. In particular:

(1) In 0.7, compaction, including the validating compactions that are part of repair, is non-concurrent, so if your repair starts while a long-running compaction is in progress, it will have to wait.

(2) Semi-related: the merkle tree calculation that is part of repair/anti-entropy may happen "out of sync" if one of the participating nodes happens to be busy with compaction. This in turn causes additional data to be sent as part of repair.

That might be why your immediately following repair took a long time, but it's difficult to tell.

If you're having issues with repair and large data sets, I would generally say that upgrading to 0.8 is recommended. However, if you're on 0.7.4, beware of https://issues.apache.org/jira/browse/CASSANDRA-3166

-- 
/ Peter Schuller (@scode on twitter)