Thanks for sharing your experience, Ben.

On 15 Sep 2016 11:35 am, "Ben Slater" <ben.sla...@instaclustr.com> wrote:
> We’ve successfully used the rsync method you outline quite a few times in
> situations where we’ve had clusters that take forever to add new nodes
> (mainly due to secondary indexes) and need to do a quick replacement for
> one reason or another. As you mention, the main disadvantage we ran into is
> that the node doesn’t get cleaned up through the replacement process like a
> newly streamed node does (plus the extra operational complexity).
>
> Cheers
> Ben
>
> On Thu, 15 Sep 2016 at 19:47 Vasileios Vlachos <vasileiosvlac...@gmail.com>
> wrote:
>
>> Hello and thanks for your responses,
>>
>> OK, so increasing stream_throughput_outbound_megabits_per_sec makes no
>> difference. Any ideas why streaming is limited to only two of the three
>> nodes available?
>>
>> As an alternative to slow streaming I tried this:
>>
>> - install C* on a new node, stop the service and delete
>>   /var/lib/cassandra/*
>> - rsync /etc/cassandra from old node to new node
>> - rsync /var/lib/cassandra from old node to new node
>> - stop C* on the old node
>> - rsync /var/lib/cassandra from old node to new node
>> - move the old node to a different IP
>> - move the new node to the old node's original IP
>> - start C* on the new node (no need for the replace_node option in
>>   cassandra-env.sh)
>>
>> This technique has been successful so far for a demo cluster with less
>> data. The only disadvantage for us is that we were hoping that by streaming
>> the SSTables to the new node, tombstones would be discarded (freeing a lot
>> of disk space on our live cluster). This is exactly what happened for the
>> one node we have streamed so far; unfortunately, the slow streaming
>> generates a lot of hints, which makes recovery a very long process.
>>
>> Do you guys see any other problems with the rsync method that I've
>> skipped?
>>
>> Regarding the tombstones issue (if we finally do what I described above),
>> I'm thinking sstablesplit. Then compaction should deal with it (I think). I
>> have not used sstablesplit in the past, so another thing I'd like to ask is
>> whether you guys find this a good or bad idea for what I'm trying to do.
>>
>> Many thanks,
>> Vasilis
>>
>> On Mon, Sep 12, 2016 at 6:42 PM, Jeff Jirsa <jji...@apache.org> wrote:
>>
>>> On 2016-09-12 09:38 (-0700), daemeon reiydelle <daeme...@gmail.com>
>>> wrote:
>>> > Re. throughput. That looks slow for jumbo with 10g. Check your
>>> > networks.
>>>
>>> It's extremely unlikely you'll be able to saturate a 10g link with a
>>> single Cassandra instance.
>>>
>>> Faster Cassandra streaming is a work in progress - being able to send
>>> more than one file at a time is probably the most obvious area for
>>> improvement, and being able to better deal with the CPU / garbage generated
>>> on the receiving side is just behind that. You'll likely be able to stream
>>> 10-15 MB/s per sending server or CPU core, whichever is less (in a vnode
>>> setup, you'll be CPU bound - in a single-token setup, you'll be stream
>>> bound).
>>>
> --
> Ben Slater
> Chief Product Officer
> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
> +61 437 929 798
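
For anyone following the thread, the rsync replacement steps quoted above could be sketched roughly as below. This is only an illustration, not a tested procedure: the helper name rsync_replacement_plan, the host name "old-node", the "service cassandra stop/start" commands and the rsync flags (-a, --delete) are my assumptions and do not come from the thread. The helper only builds and prints the commands; it does not run anything.

```python
from typing import List


def rsync_replacement_plan(old_host: str,
                           data_dir: str = "/var/lib/cassandra",
                           conf_dir: str = "/etc/cassandra") -> List[str]:
    """Return the ordered commands for the rsync replacement, as run from the new node.

    Hypothetical helper: the service name and rsync flags are assumptions.
    """
    stop = "sudo service cassandra stop"    # assumed service name
    start = "sudo service cassandra start"  # assumed service name
    # Trailing slashes make rsync copy the directory contents, not the directory itself.
    # --delete drops files on the new node that no longer exist on the old node
    # (for example SSTables compacted away between the two passes).
    sync_data = f"rsync -a --delete {old_host}:{data_dir}/ {data_dir}/"
    return [
        stop,                                            # new node: stop the fresh install
        f"sudo rm -rf {data_dir}/*",                     # new node: clear its own data
        f"rsync -a {old_host}:{conf_dir}/ {conf_dir}/",  # copy the configuration
        sync_data,                                       # first pass, old node still running
        f"ssh {old_host} {stop}",                        # stop C* on the old node
        sync_data,                                       # second pass picks up the remainder
        "# manual: move the old node to a different IP, then give the new node the old IP",
        start,                                           # new node: start C*
    ]


if __name__ == "__main__":
    for step, command in enumerate(rsync_replacement_plan("old-node"), start=1):
        print(f"{step}. {command}")
```

The point of the two data passes is that the first, long copy happens while the old node is still serving traffic, so the old node only has to be down for the much shorter second pass.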