sstableloader: How much does it actually need?

2020-02-05 Thread Voytek Jarnot
Scenario: Cassandra 3.11.x, 4 nodes, RF=3; moving to identically-sized cluster via snapshots and sstableloader. As far as I can tell, in the topology given above, any 2 nodes contain all of the data. In terms of migrating this cluster, would there be any downsides or risks with snapshotting and lo
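The placement claim in the question (4 nodes, RF=3, any 2 nodes hold a replica of every range) can be checked with a small model. This is only a sketch under loud assumptions not stated in the thread: one token per node, a single datacenter, and SimpleStrategy-style placement where each range lives on its owner plus the next RF-1 nodes clockwise. It models where replicas are *placed*, not whether each replica actually received every write, which is one reason a subset of nodes may still not hold a full copy.

```python
from itertools import combinations


def replicas_for_range(owner_index, node_count, rf):
    """Model placement: the owning node plus the next rf-1 nodes clockwise.

    Assumption: one token per node, SimpleStrategy-like ring walk.
    """
    return {(owner_index + i) % node_count for i in range(rf)}


def uncovered_ranges(subset, node_count, rf):
    """Return the ranges that have no replica on any node in `subset`."""
    chosen = set(subset)
    return [
        r for r in range(node_count)
        if not (replicas_for_range(r, node_count, rf) & chosen)
    ]


NODES, RF = 4, 3  # the scenario from the question

# Check every possible pair of nodes for ranges it does not cover.
for pair in combinations(range(NODES), 2):
    print(pair, "missing ranges:", uncovered_ranges(pair, NODES, RF))
```

In this model each range is absent from exactly one node, so no pair of nodes can miss a range. A single node, by contrast, does miss one range.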

ApacheCon Cassandra and NGCC 2020 Call for proposals

2020-02-05 Thread Nate McCall
I am delighted to share with you that we, the Apache Cassandra community, in light of our success at last year's conference, have been given a three-day track at this year's ApacheCon in New Orleans, LA, USA [0]. The goal of this track is simple: we are going to get together to talk a

Re: sstableloader: How much does it actually need?

2020-02-05 Thread Erick Ramirez
Unfortunately, there isn't a guarantee that 2 nodes alone will have the full copy of data. I'd rather not say "it depends". 😁 TIP: If the nodes in the target cluster have identical tokens allocated, you can just do a straight copy of the sstables node-for-node then do nodetool refresh. If the targ
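The tip above depends on the source and target nodes having identical tokens. A minimal sketch of that precondition check follows; the token values and the `mismatched_nodes` helper are hypothetical, and how the token lists are collected (for example, by reading them off `nodetool ring` output on each cluster) is left as an assumption.

```python
def mismatched_nodes(source_tokens, target_tokens):
    """Compare token sets node-for-node.

    Each argument is a list with one collection of tokens per node, in the
    order the nodes are to be paired. Returns the indices of pairs whose
    token sets differ; an empty list means every pair matches.
    """
    if len(source_tokens) != len(target_tokens):
        raise ValueError("clusters have different node counts")
    return [
        i for i, (src, dst) in enumerate(zip(source_tokens, target_tokens))
        if set(src) != set(dst)
    ]


# Hypothetical tokens, for illustration only.
source = [[-9000, 100], [-4000, 5000]]
target = [[100, -9000], [-4000, 5001]]

print(mismatched_nodes(source, target))
```

Only when the result is empty does the node-for-node copy described in the tip apply as stated.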

Re: sstableloader: How much does it actually need?

2020-02-05 Thread Sergio
Another option is the DSE-bulk loader, but it requires converting the data to CSV/JSON (a good option if you don't want to work with sstableloader and deal with collecting all the sstables from all the nodes): https://docs.datastax.com/en/dsbulk/doc/index.html Cheers, Sergio On Wed, 5 Feb 2020 at 16:56

Re: sstableloader: How much does it actually need?

2020-02-05 Thread Dor Laor
Another option is to use the Spark migrator, it reads a source CQL cluster and writes to another. It has a validation stage that compares a full scan and reports the diff: https://github.com/scylladb/scylla-migrator There are many more ways to clone a cluster. My main recommendation is to 'optimiz

Re: sstableloader: How much does it actually need?

2020-02-05 Thread Erick Ramirez
> > Another option is the DSE-bulk loader but it will require to convert to > csv/json (good option if you don't like to play with sstableloader and deal > to get all the sstables from all the nodes) > https://docs.datastax.com/en/dsbulk/doc/index.html > Thanks, Sergio. The DataStax Bulk Loader wa

Nodes becoming unresponsive

2020-02-05 Thread Surbhi Gupta
Hi, We have noticed that one node in a Cassandra cluster is at 100% CPU utilization; using top, we can see that the Cassandra process is showing futex_wait. We are on CentOS release 6.10 (Final). According to the document below, the futex bug was in CentOS 6.6. https://support.datastax.com/hc/en-us/articles

Re: Nodes becoming unresponsive

2020-02-05 Thread Erick Ramirez
I wrote that article 5 years ago but I didn't think it would still be relevant today. 😁 Have you tried to do a thread dump to see which are the most dominant threads? That's the most effective way of troubleshooting high CPU situations. Cheers!
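One common way to find the dominant threads is to take the busiest per-thread IDs reported by the OS (for example, from `top -H`) and look them up in a thread dump, where HotSpot-style dumps record the native thread ID in hex as `nid=0x...`. The sketch below assumes that dump format; the sample dump text and thread IDs are made up for illustration and do not come from the thread.

```python
import re

# Matches a thread header line such as:
#   "Name" #12 daemon prio=5 os_prio=0 tid=0x... nid=0x1a2b runnable [...]
# Assumption: HotSpot/jstack-style header with a hex nid= field.
NID_RE = re.compile(
    r'^"(?P<name>[^"]+)".*?\bnid=(?P<nid>0x[0-9a-fA-F]+)',
    re.MULTILINE,
)


def threads_by_nid(dump_text):
    """Map decimal native thread ID -> thread name from a thread dump."""
    return {
        int(m.group("nid"), 16): m.group("name")
        for m in NID_RE.finditer(dump_text)
    }


def name_hot_threads(dump_text, hot_tids):
    """Attach thread names to the decimal thread IDs seen in the OS tool."""
    names = threads_by_nid(dump_text)
    return [(tid, names.get(tid, "<not in dump>")) for tid in hot_tids]


# Fabricated sample dump lines, for illustration only.
sample_dump = (
    '"CompactionExecutor:1" #45 daemon prio=1 os_prio=4 '
    'tid=0x00007f1c2c0a1000 nid=0x1a2b runnable [0x00007f1c0d5f4000]\n'
    '"GossipStage:1" #30 daemon prio=5 os_prio=0 '
    'tid=0x00007f1c2c0b2000 nid=0x1a3c waiting on condition [0x00007f1c0d6f5000]\n'
)

print(name_hot_threads(sample_dump, [6699, 6716, 42]))
```

Taking several dumps a few seconds apart and repeating the lookup helps distinguish threads that are consistently busy from ones that were only momentarily active.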

Re: Nodes becoming unresponsive

2020-02-05 Thread Jeff Jirsa
The bug is in the kernel - it'd be worth looking at your specific kernel via `uname -a` just to confirm you're not somehow running an old kernel. If you're sure you're on a good kernel, then yea, thread inspection is your next step. https://github.com/aragozin/jvm-tools/blob/master/sjk-core/docs/TT

Re: Nodes becoming unresponsive

2020-02-05 Thread Erick Ramirez
Surbhi, just a *friendly* reminder that it's customary to reply back to the mailing list instead of emailing me directly so that everyone else in the list can participate. ☺ > I tried taking thread dump using kill -3 but it just came back and > no file generated. > How do you take the thread dum

Re: Nodes becoming unresponsive

2020-02-05 Thread Surbhi Gupta
Sure Erick... I tried strace as well ...