Thanks for sharing your experience, Ben.

On 15 Sep 2016 11:35 am, "Ben Slater" <ben.sla...@instaclustr.com> wrote:

> We’ve successfully used the rsync method you outline quite a few times in
> situations where we’ve had clusters that take forever to add new nodes
> (mainly due to secondary indexes) and need to do a quick replacement for
> one reason or another. As you mention, the main disadvantage we ran into is
> that the node doesn’t get cleaned up through the replacement process like a
> newly streamed node does (plus the extra operational complexity).
>
> Cheers
> Ben
>
> On Thu, 15 Sep 2016 at 19:47 Vasileios Vlachos <vasileiosvlac...@gmail.com>
> wrote:
>
>> Hello and thanks for your responses,
>>
>> OK, so increasing stream_throughput_outbound_megabits_per_sec makes no
>> difference. Any ideas why streaming is limited to only two of the three
>> nodes available?
>>
>> As an alternative to slow streaming I tried this:
>>
>>  - install C* on a new node, stop the service and delete
>>    /var/lib/cassandra/*
>>  - rsync /etc/cassandra from old node to new node
>>  - rsync /var/lib/cassandra from old node to new node
>>  - stop C* on the old node
>>  - rsync /var/lib/cassandra from old node to new node again (to pick up
>>    anything written since the first pass)
>>  - move the old node to a different IP
>>  - move the new node to the old node's original IP
>>  - start C* on the new node (no need for the replace_node option in
>>    cassandra-env.sh)
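
For anyone following along, the rsync steps above could be sketched as a dry-run plan like the one below. This is only an illustration, not what was actually run: the host name `old-node.example`, the rsync flags (`-a`, `--delete`) and the choice to pull from the new node are all assumptions. It assumes C* is already installed and stopped on the new node with /var/lib/cassandra/* emptied. It only prints commands and never executes them.

```python
import shlex

DATA_DIR = "/var/lib/cassandra/"  # path taken from the steps above
CONF_DIR = "/etc/cassandra/"      # path taken from the steps above


def rsync_pull(old_host, path):
    """Build an rsync command that mirrors `path` from old_host onto this node."""
    # Assumed flags: -a keeps permissions and timestamps (and ownership when
    # run as root); --delete removes local files that are gone on the source,
    # e.g. SSTables compacted away between the two data passes.
    return ["rsync", "-a", "--delete", f"{old_host}:{path}", path]


def replacement_plan(old_host):
    """Return the ordered steps as (description, command or None) pairs."""
    return [
        ("copy config", rsync_pull(old_host, CONF_DIR)),
        ("first data pass, old node still running", rsync_pull(old_host, DATA_DIR)),
        ("stop C* on the old node (manual step)", None),
        ("second data pass, old node stopped", rsync_pull(old_host, DATA_DIR)),
        ("swap the IP addresses (manual step)", None),
        ("start C* on the new node (manual step)", None),
    ]


if __name__ == "__main__":
    # Dry run: print each step; nothing is executed.
    for description, command in replacement_plan("old-node.example"):
        print(f"# {description}")
        if command is not None:
            print(shlex.join(command))
```

The second data pass is the same command as the first; with the old node stopped it should only need to transfer what changed in between, which keeps the downtime window short.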
>>
>> This technique has been successful so far for a demo cluster with less
>> data. The only disadvantage for us is that we were hoping that by streaming
>> the SSTables to the new node, tombstones would be discarded (freeing a lot
>> of disk space on our live cluster). This is exactly what happened for the
>> one node we streamed so far; unfortunately, the slow streaming generates a
>> lot of hints which makes recovery a very long process.
>>
>> Do you guys see any other problems with the rsync method that I've
>> skipped?
>>
>> Regarding the tombstones issue (if we finally do what I described above),
>> I'm thinking sstablesplit. Then compaction should deal with it (I think). I
>> have not used sstablesplit in the past, so another thing I'd like to ask is
>> if you guys find this a good/bad idea for what I'm trying to do.
>>
>> Many thanks,
>> Vasilis
>>
>> On Mon, Sep 12, 2016 at 6:42 PM, Jeff Jirsa <jji...@apache.org> wrote:
>>
>>>
>>>
>>> On 2016-09-12 09:38 (-0700), daemeon reiydelle <daeme...@gmail.com>
>>> wrote:
>>> > Re. throughput. That looks slow for jumbo with 10g. Check your
>>> > networks.
>>> >
>>> >
>>>
>>> It's extremely unlikely you'll be able to saturate a 10g link with a
>>> single Cassandra instance.
>>>
>>> Faster Cassandra streaming is a work in progress - being able to send
>>> more than one file at a time is probably the most obvious area for
>>> improvement, and being able to better deal with the CPU / garbage generated
>>> on the receiving side is just behind that. You'll likely be able to stream
>>> 10-15 MB/s per sending server or cpu core, whichever is less (in a vnode
>>> setup, you'll be cpu bound - in a single-token setup, you'll be stream
>>> bound).
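
To put the 10-15 MB/s figure in perspective, here is a rough back-of-the-envelope sketch. The 500 GB data size is purely hypothetical (no size is given in this thread), and the function assumes throughput adds up linearly across senders and that the receiving node is not the bottleneck, which may well not hold.

```python
def stream_hours(data_gb, senders, mb_per_sec_per_sender):
    """Estimate hours needed to stream data_gb at a fixed per-sender rate."""
    total_mb = data_gb * 1024  # treat 1 GB as 1024 MB
    seconds = total_mb / (senders * mb_per_sec_per_sender)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical 500 GB node, streamed from two senders.
    for rate in (10, 15):
        hours = stream_hours(500, 2, rate)
        print(f"{rate} MB/s per sender: {hours:.1f} h")
```

With those assumed inputs the estimate comes out at roughly 7.1 hours at 10 MB/s and 4.7 hours at 15 MB/s, and a third sender would cut each figure by a third.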
>>>
>>>
>>>
>> --
> ————————
> Ben Slater
> Chief Product Officer
> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
> +61 437 929 798
>
