From ticket 2818:
"One (reasonably simple) proposition to fix this would be to have repair
schedule validation compactions across nodes one by one (ie, one CF/range
at a time), waiting for all nodes to return their tree before submitting
the next request. Then on each node, we should make sure that the node will
start the validation compaction as soon as requested. For that, we probably
want to have a specific executor for validation compaction"
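The scheduling the ticket proposes can be sketched roughly as follows. This is an illustrative outline only, not Cassandra's actual code: the names (`repair`, `request_validation`) and the futures-style interface are assumptions. The point is that the coordinator handles one CF/range at a time and blocks until every replica has returned its Merkle tree before submitting the next validation request.

```python
# Hypothetical sketch of the proposed repair scheduling: validate one
# (column family, range) pair at a time, waiting for all replicas'
# Merkle trees before moving on. `request_validation(node, cf, rng)`
# is an assumed helper that returns a zero-argument callable which
# blocks until that node's tree is ready (a poor man's future).

def repair(column_families, ranges, replicas, request_validation):
    """Run validation compactions strictly one CF/range at a time."""
    trees = {}
    for cf in column_families:
        for rng in ranges:
            # Ask every replica for its Merkle tree of this CF/range...
            pending = [request_validation(node, cf, rng) for node in replicas]
            # ...and block until all trees are back before submitting the
            # next request, so validations on a node are started as soon
            # as requested rather than queueing behind other compactions.
            trees[(cf, rng)] = [fut() for fut in pending]
    return trees
```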
.. This was how I thought repair worked.
Anyway, in our case, we only have one CF, so I'm not sure if both issues
apply to my situation.
Thanks. Looking forward to the release where these 2 things are fixed.
On , Jonathan Ellis <jbel...@gmail.com> wrote:
On Thu, Jul 21, 2011 at 9:14 AM, Jonathan Colby
<jonathan.co...@gmail.com> wrote:
> I regularly run repair on my cassandra cluster. However, I often see
> that during the repair operation very large amounts of data are
> transferred to other nodes.
https://issues.apache.org/jira/browse/CASSANDRA-2280
https://issues.apache.org/jira/browse/CASSANDRA-2816
> My question is, if only some data is out of sync, why are entire Data
> files being transferred?
Repair streams ranges of files as a unit (each streamed range becomes a
new file on the target node) rather than using the normal write path.
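A toy model of why this over-streams: each Merkle-tree leaf covers a whole sub-range of rows, and any hash mismatch causes the entire sub-range to be streamed, even if only one row inside it actually differs. This is not Cassandra code; the bucketing scheme and function names are made up purely to illustrate the mechanism.

```python
import hashlib

def hash_bucket(key, leaves):
    # Toy partitioner: assign a row key to one of `leaves` sub-ranges.
    return ord(key[0]) % leaves

def leaf_hashes(rows, leaves):
    """Group rows into sub-ranges and hash each sub-range as a unit."""
    buckets = [[] for _ in range(leaves)]
    for key, value in sorted(rows.items()):
        buckets[hash_bucket(key, leaves)].append((key, value))
    hashes = [hashlib.sha256(repr(b).encode()).hexdigest() for b in buckets]
    return hashes, buckets

def rows_to_stream(local, remote, leaves=4):
    """Return every local row in any sub-range whose leaf hash differs."""
    local_hashes, local_buckets = leaf_hashes(local, leaves)
    remote_hashes, _ = leaf_hashes(remote, leaves)
    out = []
    for i in range(leaves):
        if local_hashes[i] != remote_hashes[i]:
            # The whole sub-range streams, not just the mismatched row.
            out.extend(local_buckets[i])
    return out
```

For example, if only row "a" differs between replicas but row "e" falls in the same sub-range, both rows are streamed; rows in matching sub-ranges are skipped.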
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com