I have this problem too, and I don't understand why.

I can repair my nodes very quickly by looping through all my data (reading
your data triggers read repair), but nodetool repair takes forever. I
understand that nodetool repair builds Merkle trees and so on, so it's a
different algorithm, but why can't nodetool repair be smart enough to choose
the best algorithm? Also, I don't understand what's special about my data
that makes nodetool repair so much slower than looping through all my data.
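
For anyone following along, here is a rough sketch of the Merkle-tree idea as I
understand it. This is a toy model, not Cassandra's actual code: the function
names, the bucket count, and the hashing scheme are all made up for
illustration. Each replica hashes every row into a fixed number of buckets
(stand-ins for token ranges), builds a hash tree over the buckets, and the two
trees are compared from the root down.

```python
import hashlib

def leaf_hashes(rows, num_leaves):
    """Hash every row into one of num_leaves buckets (toy token ranges)."""
    buckets = [hashlib.sha256() for _ in range(num_leaves)]
    for key in sorted(rows):
        idx = int(hashlib.md5(key.encode()).hexdigest(), 16) % num_leaves
        buckets[idx].update(key.encode() + b"\x00" + rows[key].encode() + b"\x00")
    return [b.digest() for b in buckets]

def build_tree(leaves):
    """Return the tree as a list of levels: leaves first, root last.
    Assumes len(leaves) is a power of two."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([hashlib.sha256(prev[i] + prev[i + 1]).digest()
                       for i in range(0, len(prev), 2)])
    return levels

def differing_leaves(tree_a, tree_b):
    """Walk down from the root, descending only into mismatching subtrees."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # roots match: replicas agree, nothing to transfer
    suspects = [0]
    for level in range(top - 1, -1, -1):
        suspects = [child
                    for i in suspects
                    for child in (2 * i, 2 * i + 1)
                    if tree_a[level][child] != tree_b[level][child]]
    return suspects

# Two replicas that differ in a single row.
replica_a = {f"key{i}": "v1" for i in range(100)}
replica_b = dict(replica_a)
replica_b["key42"] = "v2"

tree_a = build_tree(leaf_hashes(replica_a, 8))
tree_b = build_tree(leaf_hashes(replica_b, 8))
diff = differing_leaves(tree_a, tree_b)
print(diff)  # index of the one bucket that would need to be exchanged
```

Two things stand out in this model. Both replicas have to read and hash every
row before anything can be compared, even when only one row differs, and the
unit of transfer is a whole bucket rather than a single row. If the real
repair behaves similarly, that might account for some of the cost, but I
haven't verified this against the Cassandra source.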


On Wed, Jul 20, 2011 at 12:18 AM, Maxim Potekhin <potek...@bnl.gov> wrote:

> Thanks Edward. I'm told by our IT that the switch connecting the nodes is
> pretty fast.
> Seriously, in my house I copy complete DVD images from my bedroom to
> the living room downstairs via WiFi, and a dozen GB does not seem like a
> problem, on dirt cheap hardware (Patriot Box Office).
>
> I also have just _one_ major column family but caveat emptor -- 8 indexes
> attached to it (and there will be more). There is one accounting CF which
> is small and can't possibly make a difference.
>
> By contrast, compaction (as in nodetool) performs quite well on this
> cluster. I'm starting to suspect some sort of malfunction.
>
> I looked at the system log during the "repair", and there is some
> compaction agent doing work that I'm not sure makes sense (and I didn't
> call for it). Disk utilization all of a sudden goes up to 40% per Ganglia
> and stays there, which is pretty silly considering the cluster is IDLE and
> we have SSDs. No external writes, no reads. There are occasional GC
> stoppages, but these I can live with.
>
> This repair debacle has now happened twice in a row. Cr@p. I need to go to
> production soon and that doesn't look good at all. If I can't manage a
> system that simple (and/or get help on this list) I may have to cut my
> losses, i.e. stay with Oracle.
>
> Regards,
>
> Maxim
>
> On 7/19/2011 12:16 PM, Edward Capriolo wrote:
>
>>
>> Well, most SSDs are pretty fast. There is one more thing to consider. If
>> Cassandra determines nodes are out of sync it has to transfer data across
>> the network. If that is the case you have to look at 'nodetool streams' and
>> determine how much data is being transferred between nodes. There are some
>> open tickets where, with larger tables, repair is streaming more than it
>> needs to. But even if the transfers are only 10% of your 200 GB,
>> transferring 20 GB is not trivial.
>>
>> If you have multiple keyspaces and column families, repairing them one at
>> a time might make the process more manageable.
>>
>
>