Re: Repair Process Taking too long

Frank Ng Wed, 11 Apr 2012 21:06:38 -0700

Thank you for confirming that the per node data size is most likely causing
the long repair process.  I have tried a repair on smaller column families
and it was significantly faster.


On Wed, Apr 11, 2012 at 9:55 PM, aaron morton <aa...@thelastpickle.com> wrote:

> If you have 1TB of data it will take a long time to repair. Every bit of
> data has to be read and a hash generated. This is one of the reasons we
> often suggest that around 300 to 400GB per node is a good load in the
> general case.
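
To illustrate why repair has to read everything, here is a minimal Merkle tree sketch in Python. It is not Cassandra's implementation: the row encoding, the use of SHA-256, and the function names (`build_levels`, `root`, `differing_leaves`) are all assumptions made for the example. The point is only that every row must be read and hashed before two replicas can compare trees.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an arbitrary choice for this sketch)."""
    return hashlib.sha256(data).digest()


def build_levels(rows):
    """Return the tree as a list of levels: leaf hashes first, root last."""
    level = [_h(row) for row in rows]  # every row is read and hashed
    if not level:
        level = [_h(b"")]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Pad odd-sized levels by repeating the last hash.
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def root(rows):
    """Root hash summarising all rows."""
    return build_levels(rows)[-1][0]


def differing_leaves(rows_a, rows_b):
    """Indexes of leaves that differ (assumes both replicas hold the same number of rows)."""
    leaves_a = build_levels(rows_a)[0]
    leaves_b = build_levels(rows_b)[0]
    return [i for i, (x, y) in enumerate(zip(leaves_a, leaves_b)) if x != y]


# Hypothetical replicas that disagree on one row.
replica_a = [b"k1=v1", b"k2=v2", b"k3=v3"]
replica_b = [b"k1=v1", b"k2=STALE", b"k3=v3"]
in_sync = root(replica_a) == root(replica_b)
to_stream = differing_leaves(replica_a, replica_b)
```

Matching roots mean nothing needs streaming; otherwise only the differing leaves do. Building the tree still costs a full read of the data, so the time grows with per-node data size.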
>
> Look at nodetool compactionstats. Is there a validation compaction
> running? If so, it is still building the Merkle hash tree.
>
> Look at nodetool netstats. Is it streaming data? If so, all hash trees
> have been calculated.
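
The two checks above amount to a small decision rule, sketched here in Python. The function name `repair_phase` and its boolean inputs are hypothetical; how you derive them from `nodetool compactionstats` and `nodetool netstats` output depends on your Cassandra version, so that parsing is deliberately left out.

```python
def repair_phase(validation_running: bool, streaming: bool) -> str:
    """Map the two nodetool observations to a likely repair phase.

    validation_running: a validation compaction shows in compactionstats.
    streaming: netstats shows data being streamed.
    """
    if validation_running:
        # Hash trees are still being built from the on-disk data.
        return "building Merkle trees"
    if streaming:
        # Trees are done; out-of-sync ranges are being exchanged.
        return "streaming data"
    # Neither is reported, as seen in this thread right after repair starts.
    return "nothing reported"


phase = repair_phase(validation_running=True, streaming=False)
```

Sampling both commands periodically during a repair shows which phase takes most of the time.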
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 12/04/2012, at 2:16 AM, Frank Ng wrote:
>
> Can you expand further on your issue? Were you using RandomPartitioner?
>
> thanks
>
> On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach <leim...@gmail.com> wrote:
>
>> I had this happen when I had really poorly generated tokens for the ring.
>>  Cassandra seems to accept numbers that are too big.  You get hot spots
>> when you think you should be balanced and repair never ends (I think there
>> is a 48 hour timeout).
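
For the token problem described above, a quick sanity check can be sketched in Python. It assumes RandomPartitioner, whose token space runs from 0 to 2**127, and evenly spaced initial tokens; the thread does not confirm which partitioner was in use. The names `balanced_tokens` and `out_of_range` are made up for this example.

```python
RING_MAX = 2 ** 127  # upper end of the RandomPartitioner token space (assumed partitioner)


def balanced_tokens(node_count: int) -> list:
    """Evenly spaced initial tokens for node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_MAX // node_count for i in range(node_count)]


def out_of_range(tokens) -> list:
    """Tokens that fall outside 0..2**127 and would unbalance the ring."""
    return [t for t in tokens if not 0 <= t <= RING_MAX]


tokens = balanced_tokens(4)
bad = out_of_range([0, 2 ** 127 + 1, -1, 2 ** 126])
```

Comparing the tokens actually assigned to the ring against `balanced_tokens` for the same node count is one way to spot the hot spots mentioned above.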
>>
>>
>> On Tuesday, April 10, 2012, Frank Ng wrote:
>>
>>> I am not using size-tiered compaction.
>>>
>>>
>>> On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone <rh...@tinyco.com> wrote:
>>>
>>>> Data size, number of nodes, RF?
>>>>
>>>> Are you using size-tiered compaction on any of the column families that
>>>> hold a lot of your data?
>>>>
>>>> Do your cassandra logs say you are streaming a lot of ranges?
>>>> zgrep -E "(Performing streaming repair|out of sync)"
>>>>
>>>>
>>>> On Tue, Apr 10, 2012 at 9:45 AM, Igor <i...@4friends.od.ua> wrote:
>>>>
>>>>>  On 04/10/2012 07:16 PM, Frank Ng wrote:
>>>>>
>>>>> Short answer - yes.
>>>>> But you are asking the wrong question.
>>>>>
>>>>>
>>>>> I think both processes are taking a while.  When it starts up,
>>>>> netstats and compactionstats show nothing.  Is anyone out there
>>>>> successfully using ext3 with repair processes faster than this?
>>>>>
>>>>>  On Tue, Apr 10, 2012 at 10:42 AM, Igor <i...@4friends.od.ua> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> You can check with nodetool which part of the repair process is slow:
>>>>>> network streams or validation compactions. Use nodetool netstats or
>>>>>> nodetool compactionstats.
>>>>>>
>>>>>>
>>>>>> On 04/10/2012 05:16 PM, Frank Ng wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am on Cassandra 1.0.7.  My repair processes are taking over 30
>>>>>>> hours to complete.  Is it normal for the repair process to take this
>>>>>>> long?  I wonder if it's because I am using the ext3 file system.
>>>>>>>
>>>>>>> thanks
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Jonathan Rhone
>>>> Software Engineer
>>>>
>>>> *TinyCo*
>>>> 800 Market St., Fl 6
>>>> San Francisco, CA 94102
>>>> www.tinyco.com
>>>>
>>>>
>>>
>
>
