From ticket CASSANDRA-2818:
"One (reasonably simple) proposition to fix this would be to have repair schedule validation compactions across nodes one by one (ie, one CF/range at a time), waiting for all nodes to return their tree before submitting the next request. Then on each node, we should make sure that the node will start the validation compaction as soon as requested. For that, we probably want to have a specific executor for validation compaction"

.. This was the way I thought repair worked.

Anyway, in our case, we only have one CF, so I'm not sure if both issues apply to my situation.

Thanks. Looking forward to the release where these 2 things are fixed.

On , Jonathan Ellis <jbel...@gmail.com> wrote:
On Thu, Jul 21, 2011 at 9:14 AM, Jonathan Colby
<jonathan.co...@gmail.com> wrote:

> I regularly run repair on my cassandra cluster. However, I often see that during the repair operation very large amounts of data are transferred to other nodes.



https://issues.apache.org/jira/browse/CASSANDRA-2280

https://issues.apache.org/jira/browse/CASSANDRA-2816



> My question is, if only some data is out of sync, why are entire Data files being transferred?



Repair streams ranges of files as a unit (which becomes a new file on the target node) rather than using the normal write path.
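The consequence can be seen in a small sketch. This is not Cassandra's code — the leaf layout and names are made up for illustration — but it shows why one out-of-sync row can trigger a large transfer: repair compares Merkle trees range by range, and any range whose hashes differ is streamed whole.

```python
# Illustrative sketch (not Cassandra's implementation) of Merkle-tree
# comparison: each leaf covers a token range, and any leaf whose hashes
# differ between replicas is streamed in its entirety, even if only a
# single row inside that range actually changed.

def ranges_to_stream(local_leaves, remote_leaves):
    """Each leaf is (token_range, hash); return ranges whose hashes differ."""
    return [rng for (rng, h1), (_, h2) in zip(local_leaves, remote_leaves)
            if h1 != h2]

local  = [((0, 100), "aa"), ((100, 200), "bb"), ((200, 300), "cc")]
remote = [((0, 100), "aa"), ((100, 200), "XX"), ((200, 300), "cc")]

# A single changed row inside (100, 200) flags the whole range:
# ranges_to_stream(local, remote) -> [(100, 200)]
```

The coarser the tree's leaves relative to the data, the more "extra" data rides along with each mismatched range — which is why a small inconsistency can still produce a large stream.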



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com