sstableloader: How much does it actually need?
Scenario: Cassandra 3.11.x, 4 nodes, RF=3; moving to an identically-sized cluster via snapshots and sstableloader.

As far as I can tell, in the topology given above, any 2 nodes contain all of the data. In terms of migrating this cluster, would there be any downsides or risks with snapshotting and loading (sstableloader) only 2 of the nodes rather than all 4?

Apologies for the spate of hypotheticals lately, this project is making life interesting.

Thanks,
Voytek Jarnot
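The "any 2 nodes" reasoning in the question can be checked with a small sketch. It assumes SimpleStrategy-style placement (each range is stored on its owner plus the next RF-1 nodes clockwise) and one range per node; the function names are made up for illustration. It only counts which nodes are replicas for each range. Whether every replica actually received every write depends on repair state and write consistency, which this does not model.

```python
from itertools import combinations


def replicas_for_range(owner_index, node_count, rf):
    """Replica set for the range owned by owner_index.

    Assumption: SimpleStrategy-style placement, i.e. the owner plus the
    next rf-1 nodes walking clockwise around the ring.
    """
    return {(owner_index + step) % node_count for step in range(rf)}


def subsets_covering_all_ranges(node_count, rf, subset_size):
    """Return every node subset that holds at least one replica of each range."""
    placements = [replicas_for_range(i, node_count, rf) for i in range(node_count)]
    covering = []
    for subset in combinations(range(node_count), subset_size):
        chosen = set(subset)
        # A subset covers the ring if it intersects every range's replica set.
        if all(chosen & replicas for replicas in placements):
            covering.append(subset)
    return covering


if __name__ == "__main__":
    pairs = subsets_covering_all_ranges(node_count=4, rf=3, subset_size=2)
    singles = subsets_covering_all_ranges(node_count=4, rf=3, subset_size=1)
    print("pairs that are replicas for every range:", len(pairs))
    print("single nodes that are replicas for every range:", len(singles))
```

Under these assumptions each range is missing from exactly one of the four nodes, so every pair is a replica for every range while no single node is. That is a statement about ownership only, not about the data being complete on those replicas.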
ApacheCon Cassandra and NGCC 2020 Call for proposals
I am delighted to share with you that we, the Apache Cassandra community, in light of our success at last year's conference, have been given a three-day track at this year's ApacheCon in New Orleans, LA, USA [0]. The goal of this track is simple: we are going to get together to talk about Apache Cassandra. As such, this will be the ideal place to network with peers, ask questions, get answers, etc.

On day one, we will be having our Next Generation Cassandra Conference (NGCC). All are welcome to attend, but this day is targeted at Apache Cassandra committers, contributors and large-scale cluster operators, who can get together and discuss topics of interest to them for future development efforts. The content will focus on internals and will be geared towards folks with knowledge of the codebase and/or experience operating Cassandra in very large environments. Talk submissions for NGCC should take this target audience into account.

Days two and three will be more general purpose and accessible to a wider audience. If you are interested in speaking here, put something together that tells a story others will want to hear. What we are looking for is general use case submissions that our users will find interesting. This can be how you solved a specific problem or just a general picture of how your organization uses Apache Cassandra. A good submission will embrace the open source ethos of sharing information to help others solve similar problems.

NGCC talks will be targeted to 30 minutes with 15 minutes for questions or small breakout discussions. General purpose talks will have 50 minutes with five minutes for questions.

For more information, including details of how to submit proposals, please see this page: https://acna2020.jamhosted.net

Please indicate "Cassandra" as the category and add NGCC at the top of the "Proposal abstract" text box if you are submitting an NGCC talk.
If you are interested in helping organize, plan, and review submissions for the Cassandra track, we'll send additional details out closer to the CFP deadline about how you can be involved. [0] https://www.apachecon.com/acna2020/
Re: sstableloader: How much does it actually need?
Unfortunately, there isn't a guarantee that 2 nodes alone will have the full copy of data. I'd rather not say "it depends". 😁

TIP: If the nodes in the target cluster have identical tokens allocated, you can just do a straight copy of the sstables node-for-node then do nodetool refresh. If the target cluster is already built and you can't assign the same tokens then sstableloader is your only option.

Cheers!

P.S. No need to apologise for asking questions. That's what we're all here for. Just keep them coming. 👍
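A minimal sketch of the two routes mentioned above, as helper functions that only build the command lines (they do not run anything). The keyspace `app`, table `events`, contact points and staging path are hypothetical; `nodetool refresh <keyspace> <table>` and `sstableloader -d <hosts> <directory>` are the standard invocations, and any other options your cluster needs (authentication, SSL, throttling) are left out.

```python
import shlex


def refresh_commands(tables):
    """Build one `nodetool refresh` command per (keyspace, table) pair.

    Intended for the node-for-node copy route: run on each target node
    after its sstables have been copied into the table's data directory.
    """
    return [["nodetool", "refresh", keyspace, table] for keyspace, table in tables]


def sstableloader_command(contact_points, sstable_dir):
    """Build an `sstableloader` command that streams one table's sstables.

    sstable_dir is expected to end in <keyspace>/<table>, because the tool
    takes the keyspace and table names from the directory path.
    """
    return ["sstableloader", "-d", ",".join(contact_points), sstable_dir]


def as_shell(argv):
    """Render an argv list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for argv in refresh_commands([("app", "events")]):
        print(as_shell(argv))
    print(as_shell(sstableloader_command(["10.0.0.1", "10.0.0.2"], "/tmp/load/app/events")))
```

With sstableloader the snapshot of every source node would be loaded in turn, since the tool streams each row to whichever target nodes own it.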
Re: sstableloader: How much does it actually need?
Another option is the DataStax Bulk Loader (dsbulk), but it requires converting the data to CSV/JSON. It is a good option if you would rather not work with sstableloader and deal with collecting all the sstables from all the nodes: https://docs.datastax.com/en/dsbulk/doc/index.html

Cheers,
Sergio

(In reply to Erick Ramirez, Wed, 5 Feb 2020, 16:56.)
Re: sstableloader: How much does it actually need?
Another option is to use the Spark migrator. It reads from a source CQL cluster and writes to another. It has a validation stage that compares a full scan and reports the diff: https://github.com/scylladb/scylla-migrator

There are many more ways to clone a cluster. My main recommendation is to 'optimize' for correctness and simplicity first and only last for performance/time. After all, machine time for such a rare operation is cheap, engineering time is expensive, and data inconsistency is priceless.

(In reply to Sergio, Wed, Feb 5, 2020, 5:24 PM.)
Re: sstableloader: How much does it actually need?
Thanks, Sergio. The DataStax Bulk Loader was developed for a completely different use case. It doesn't really make sense to go through the trouble of converting the SSTables to CSV/JSON when you've already got the SSTables to begin with. ☺ It was really designed for loading/unloading data from non-C* sources as a replacement for the COPY command.

Cheers!
Nodes becoming unresponsive
Hi,

We have noticed that in a Cassandra cluster, one of the nodes has 100% CPU utilization. Using top, we can see that the cassandra process is showing futex_wait.

We are on CentOS release 6.10 (Final). As per the document below, the futex bug was in CentOS 6.6.
https://support.datastax.com/hc/en-us/articles/206259833-Nodes-appear-unresponsive-due-to-a-Linux-futex-wait-kernel-bug

Below are the installed patches.

    sudo rpm -q --changelog kernel-`uname -r` | grep futex | grep ref
    - [kernel] futex: Mention key referencing differences between shared and private futexes (Larry Woodman) [1167405]
    - [kernel] futex: Ensure get_futex_key_refs() always implies a barrier (Larry Woodman) [1167405]
    - [kernel] futex: Fix errors in nested key ref-counting (Denys Vlasenko) [1094458] {CVE-2014-0205}
    - [kernel] futex_lock_pi() key refcnt fix (Danny Feng) [566347] {CVE-2010-0623}

    top - 21:23:34 up 93 days, 10:43, 1 user, load average: 137.35, 147.74, 148.52
    Tasks: 658 total, 1 running, 657 sleeping, 0 stopped, 0 zombie
    Cpu(s): 93.9%us, 1.9%sy, 2.0%ni, 2.0%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
    Mem: 132236016k total, 129681568k used, 2554448k free, 215888k buffers
    Swap: 0k total, 0k used, 0k free, 93679880k cached

      PID USER      PR  NI  VIRT   RES   SHR  S   %CPU %MEM      TIME+  WCHAN      COMMAND
     7725 cassandr  20   0  258g   40g   13g  S 2302.0 32.4  305169:26  futex_wai  java
    69075 logstash  39  19 10.5g  1.5g   14m  S   42.1  1.2    6763:00  futex_wai  java
    30008 root      20   0  465m   55m   11m  S   11.5  0.0    0:02.78  poll_sche  TaniumClient
    31785 cassandr  20   0 34.9g   31m   10m  S    4.9  0.0    0:00.15  futex_wai  java
     5154 root      20   0 1523m  6260  1300  S    3.0  0.0    1073:05  hrtimer_n  collectd
     1129 root      20   0     0     0     0  S    1.3  0.0  294:55.87  kjournald  jbd2/dm-0-8
    64173 root      20   0 1512m   71m   13m  S    1.3  0.1    0:55.69  futex_wai  TaniumClient

Any idea what else can be looked at for the high CPU issue?

Thanks,
Surbhi
Re: Nodes becoming unresponsive
I wrote that article 5 years ago but I didn't think it would still be relevant today. 😁

Have you tried to do a thread dump to see which are the most dominant threads? That's the most effective way of troubleshooting high CPU situations.

Cheers!
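One common way to find the dominant threads is to pair per-thread CPU figures (for example from `top -H -p <pid>`, where the PID column shows Linux thread ids) with the `nid=0x...` field that HotSpot prints for each thread in a dump. A minimal sketch of that matching step follows; the sample dump text, thread names and CPU numbers are made up for illustration, and the function names are hypothetical.

```python
import re

# Matches the header line of each thread in a HotSpot thread dump, e.g.
# "ReadStage-2" #61 daemon prio=5 os_prio=0 tid=0x... nid=0x1e3a waiting on condition
NID_PATTERN = re.compile(r'^"(?P<name>[^"]+)".*\bnid=(?P<nid>0x[0-9a-f]+)', re.MULTILINE)

# Made-up sample, shaped like Java 8 jstack output.
SAMPLE_DUMP = (
    '"CompactionExecutor:1" #50 daemon prio=1 os_prio=4 tid=0x00007f0000000001 '
    'nid=0x1e2d runnable [0x00007f0000001000]\n'
    '   java.lang.Thread.State: RUNNABLE\n'
    '"ReadStage-2" #61 daemon prio=5 os_prio=0 tid=0x00007f0000000002 '
    'nid=0x1e3a waiting on condition [0x00007f0000002000]\n'
)


def thread_names_by_nid(dump_text):
    """Map native thread id (decimal) to Java thread name."""
    return {
        int(match.group("nid"), 16): match.group("name")
        for match in NID_PATTERN.finditer(dump_text)
    }


def hottest_threads(per_thread_cpu, dump_text, limit=5):
    """Rank threads by CPU and attach the Java thread name where known.

    per_thread_cpu maps Linux thread id -> %CPU, as read from `top -H`.
    """
    names = thread_names_by_nid(dump_text)
    ranked = sorted(per_thread_cpu.items(), key=lambda item: item[1], reverse=True)
    return [(tid, cpu, names.get(tid, "<not in dump>")) for tid, cpu in ranked[:limit]]


if __name__ == "__main__":
    for tid, cpu, name in hottest_threads({7725: 99.0, 7738: 12.5}, SAMPLE_DUMP):
        print(f"{tid} (0x{tid:x}) {cpu:5.1f}% {name}")
```

Taking the dump and the `top -H` reading as close together in time as possible keeps the thread ids comparable, since short-lived threads may appear in one and not the other.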
Re: Nodes becoming unresponsive
The bug is in the kernel, so it'd be worth looking at your specific kernel via `uname -a` just to confirm you're not somehow running an old kernel.

If you're sure you're on a good kernel, then yeah, thread inspection is your next step. https://github.com/aragozin/jvm-tools/blob/master/sjk-core/docs/TTOP.md, for example.

(In reply to Erick Ramirez, Wed, Feb 5, 2020, 8:37 PM.)
Re: Nodes becoming unresponsive
Surbhi, just a *friendly* reminder that it's customary to reply back to the mailing list instead of emailing me directly so that everyone else in the list can participate. ☺

> I tried taking thread dump using kill -3 but it just came back and
> no file generated.
> How do you take the thread dump?

The Swiss Java Knife (SJK) which Jeff referenced previously is a nice utility for it:

> ... then yea, thread inspection is your next step.
> https://github.com/aragozin/jvm-tools/blob/master/sjk-core/docs/TTOP.md ,
> for example.
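On the `kill -3` point: a HotSpot JVM that receives SIGQUIT prints the thread dump to its own standard output, wherever that was redirected when the process started, so no new file appears in the shell that sent the signal. An alternative that does produce a file is to run the JDK's `jstack <pid>` and save what it prints. A minimal sketch, assuming `jstack` is on the PATH and is run as the same OS user as the Cassandra process; the function names and output path are hypothetical.

```python
import subprocess
from pathlib import Path


def jstack_command(pid):
    """Build the argv for a plain thread dump of the given JVM process."""
    return ["jstack", str(pid)]


def save_thread_dump(pid, output_path, runner=subprocess.run):
    """Run jstack for pid and write its output to output_path.

    Returns the number of lines written. `runner` is injectable so the
    function can be exercised without a live JVM.
    """
    result = runner(jstack_command(pid), capture_output=True, text=True, check=True)
    Path(output_path).write_text(result.stdout)
    return len(result.stdout.splitlines())


# Example call (not executed here; the PID and path are illustrative only):
# save_thread_dump(7725, "/tmp/cassandra-threads.txt")
```

Taking a few dumps several seconds apart makes it easier to tell threads that are persistently busy from ones that were only momentarily running.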
Re: Nodes becoming unresponsive
Sure, Erick... I tried strace as well ...