On Fri, Feb 25, 2011 at 7:38 AM, Terje Marthinussen
<tmarthinus...@gmail.com> wrote:
>>
>> @Thibaut Britz
>> Caveat: Using simple strategy.
>> This works because Cassandra scans data at startup and then serves
>> what it finds. For a join, for example, you can rsync all the data from
>> the node below/to the right of where the new node is joining. Then
>> join without bootstrap, then cleanup both nodes. (Also, you have to
>> shut down the first node so you do not have a lost write scenario in
>> the time between rsync and new node startup.)
>>
>
> rsync all data from node to left/right..
> Wouldn't that mean that you need 2x the data to recover...?
> Terje

Terje,

In your scenario, where you are never updating, running repair becomes
less important. I have an alternative for you. I have a program I call
the "RescueRanger"; we use it to range-scan all our data, find old
entries, and then delete them. However, if we set that program to "read
only mode" and tell it to read at CL.ALL, it becomes a program that
read repairs data!

This is a tradeoff. Range scanning through all your data is not fast,
but it does not require the extra disk space. Kinda like merge sort vs
bubble sort.
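
To make the idea concrete, here is a rough sketch of what the read-only
scan does. It is a toy in-memory model, not the RescueRanger itself and
not a real Cassandra client: Replica, read_at_all and rescue_scan are
made-up names, each cell is just a (timestamp, value) pair, and
tombstones/deletes are ignored. The point is only that reading every key
from every replica and writing the newest cell back to the stale ones
brings the replicas in sync without copying whole data files around.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical toy model: a cell is (write timestamp, value).
Cell = Tuple[int, str]


@dataclass
class Replica:
    """Stand-in for one node's copy of the data (assumption, not a real API)."""
    name: str
    rows: Dict[str, Cell] = field(default_factory=dict)


def read_at_all(replicas: List[Replica], key: str) -> Optional[Cell]:
    """Toy CL.ALL read: ask every replica, keep the newest cell,
    and write it back to any replica that is missing it or is stale."""
    answers = [r.rows.get(key) for r in replicas]
    present = [c for c in answers if c is not None]
    if not present:
        return None
    newest = max(present, key=lambda c: c[0])
    for replica, cell in zip(replicas, answers):
        if cell != newest:
            replica.rows[key] = newest  # the "read repair" step
    return newest


def rescue_scan(replicas: List[Replica], page_size: int = 2) -> int:
    """Walk every key in pages and read each one at ALL.
    Read-only from the caller's point of view; returns keys scanned."""
    keys = sorted(set().union(*(r.rows.keys() for r in replicas)))
    scanned = 0
    for start in range(0, len(keys), page_size):
        for key in keys[start:start + page_size]:
            read_at_all(replicas, key)
            scanned += 1
    return scanned


if __name__ == "__main__":
    a = Replica("a", {"k1": (1, "old"), "k2": (5, "x")})
    b = Replica("b", {"k1": (2, "new")})
    c = Replica("c", {"k2": (5, "x"), "k3": (7, "y")})
    rescue_scan([a, b, c])
    print(a.rows == b.rows == c.rows)
```

The scan touches every key once, so the cost is time and read load
rather than disk space. Against a real cluster the paging would be done
with your client's range-slice calls at consistency level ALL.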
