sstableloader: How much does it actually need?

2020-02-05 Thread Voytek Jarnot
Scenario: Cassandra 3.11.x, 4 nodes, RF=3; moving to identically-sized
cluster via snapshots and sstableloader.

As far as I can tell, in the topology given above, any 2 nodes contain all
of the data. In terms of migrating this cluster, would there be any
downsides or risks with snapshotting and loading (sstableloader) only 2 of
the nodes rather than all 4?

Apologies for the spate of hypotheticals lately, this project is making
life interesting.

Thanks,
Voytek Jarnot


ApacheCon Cassandra and NGCC 2020 Call for proposals

2020-02-05 Thread Nate McCall
I am delighted to share with you that we, the Apache Cassandra community,
in light of our success at last year's conference, have been
given a three day track at this year's ApacheCon in New Orleans, LA, USA
[0].

The goal of this track is simple: we are going to get together to talk
about Apache Cassandra. As such, this will be the ideal place to network
with peers, ask questions, get answers, etc.

On day one, we will be having our Next Generation Cassandra Conference
(NGCC). All are welcome to attend but this day is targeted for Apache
Cassandra committers, contributors and large-scale cluster operators to get
together and discuss topics of interest to them for future development
efforts. The content will focus on internals and will be geared towards
folks with knowledge of the codebase and/or operating Cassandra in very
large environments. Talk submissions for NGCC should take this target
audience into account.

Days two and three will be more general purpose and accessible for a wider
audience. If you are interested in speaking here, put something together
that tells a story others will want to hear. What we are looking for is
general use case submissions that our users will find interesting. This can
be how you solved a specific problem or just a general picture into how
your organization uses Apache Cassandra. A good submission will embrace the
open source ethos of sharing information to help others solve similar
problems.

NGCC talks will be targeted to 30 minutes with 15 minutes for questions or
small break out discussions. General purpose talks will have 50 minutes
with five minutes for questions.

For more information, including details of how to submit proposals, please
see this page:
https://acna2020.jamhosted.net

Please indicate "Cassandra" as the category and add NGCC at the top of the
"Proposal abstract" text box if you are submitting an NGCC talk.

If you are interested in helping organize, plan, and review submissions for
the Cassandra track, we'll send additional details out closer to the CFP
deadline about how you can be involved.

[0] https://www.apachecon.com/acna2020/


Re: sstableloader: How much does it actually need?

2020-02-05 Thread Erick Ramirez
Unfortunately, there isn't a guarantee that 2 nodes alone will have the
full copy of data. I'd rather not say "it depends". 😁
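
A rough way to reason about the question is to separate replica placement from replica contents. The sketch below models placement only, assuming SimpleStrategy-style placement (walk the ring clockwise and take the first RF distinct nodes); the function names and the randomly generated tokens are hypothetical. It says nothing about whether each replica actually received every write, which is one likely reason there is no guarantee unless the source cluster has been fully repaired.

```python
import itertools
import random


def replicas_for_ranges(tokens_by_node, rf):
    """For every token on the ring, return the set of nodes that would hold
    the range owned by that token (SimpleStrategy-style placement)."""
    if rf > len(tokens_by_node):
        raise ValueError("rf cannot exceed the number of nodes")
    ring = sorted(
        (token, node)
        for node, tokens in tokens_by_node.items()
        for token in tokens
    )
    placements = []
    for start in range(len(ring)):
        owners = []
        position = start
        # Walk clockwise until rf distinct nodes have been collected.
        while len(owners) < rf:
            node = ring[position % len(ring)][1]
            if node not in owners:
                owners.append(node)
            position += 1
        placements.append(frozenset(owners))
    return placements


def subsets_covering_all_ranges(tokens_by_node, rf, subset_size):
    """List the node subsets that include at least one replica of every range."""
    placements = replicas_for_ranges(tokens_by_node, rf)
    nodes = sorted(tokens_by_node)
    return [
        subset
        for subset in itertools.combinations(nodes, subset_size)
        if all(owners & set(subset) for owners in placements)
    ]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical 4-node cluster with 16 vnode tokens per node.
    cluster = {
        f"node{i}": [rng.randrange(-2**63, 2**63) for _ in range(16)]
        for i in range(1, 5)
    }
    pairs = subsets_covering_all_ranges(cluster, rf=3, subset_size=2)
    print(f"{len(pairs)} of 6 node pairs hold a replica of every range")
```

In this model, with 4 nodes and RF=3 every range is missing from exactly one node, so any pair of nodes includes a replica of every range. Whether those replicas are complete and up to date is a separate question that the model does not answer.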

TIP: If the nodes in the target cluster have identical tokens allocated,
you can just do a straight copy of the sstables node-for-node then do nodetool
refresh. If the target cluster is already built and you can't assign the
same tokens then sstableloader is your only option. Cheers!
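
As a minimal sketch of the two routes, the helpers below stage one table's snapshot into a `<keyspace>/<table>` directory (sstableloader takes the keyspace and table names from the last two path components) and build the command lines without running them. The helper names, the staging layout, and the example hosts are hypothetical; the on-disk layout assumed is the Cassandra 3.x one, `data/<keyspace>/<table>-<table id>/snapshots/<tag>`.

```python
import shutil
from pathlib import Path


def stage_snapshot(table_dir, snapshot_tag, staging_root):
    """Copy one table's snapshot files into staging_root/<keyspace>/<table>/.

    table_dir is a table directory such as .../data/<keyspace>/<table>-<table id>;
    the snapshot is expected under table_dir/snapshots/<snapshot_tag>.
    """
    table_dir = Path(table_dir)
    keyspace = table_dir.parent.name
    table = table_dir.name.rsplit("-", 1)[0]  # drop the table id suffix
    target = Path(staging_root) / keyspace / table
    target.mkdir(parents=True, exist_ok=True)
    for item in (table_dir / "snapshots" / snapshot_tag).iterdir():
        if item.is_file():
            shutil.copy2(item, target / item.name)
    return target


def sstableloader_command(staged_dir, target_hosts):
    """Command line to stream a staged table into the target cluster."""
    return ["sstableloader", "-d", ",".join(target_hosts), str(staged_dir)]


def refresh_command(keyspace, table):
    """Command line to load sstables that were copied straight into a
    target node's table directory (the identical-tokens route)."""
    return ["nodetool", "refresh", keyspace, table]
```

The commands are returned as argument lists so they can be reviewed, logged, or passed to `subprocess.run` once the paths and hosts have been checked.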

P.S. No need to apologise for asking questions. That's what we're all here
for. Just keep them coming. 👍



Re: sstableloader: How much does it actually need?

2020-02-05 Thread Sergio
Another option is the DSE-bulk loader, but it requires converting the data to
csv/json. It is a good option if you don't want to work with sstableloader and
deal with collecting all the sstables from all the nodes.
https://docs.datastax.com/en/dsbulk/doc/index.html

Cheers

Sergio

On Wed, 5 Feb 2020 at 16:56, Erick Ramirez wrote:

> Unfortunately, there isn't a guarantee that 2 nodes alone will have the
> full copy of data. I'd rather not say "it depends". 😁
>
> TIP: If the nodes in the target cluster have identical tokens allocated,
> you can just do a straight copy of the sstables node-for-node then do nodetool
> refresh. If the target cluster is already built and you can't assign the
> same tokens then sstableloader is your only option. Cheers!
>
> P.S. No need to apologise for asking questions. That's what we're all here
> for. Just keep them coming. 👍
>


Re: sstableloader: How much does it actually need?

2020-02-05 Thread Dor Laor
Another option is to use the Spark migrator. It reads from a source CQL
cluster and writes to another. It has a validation stage that compares a
full scan and reports the diff:
https://github.com/scylladb/scylla-migrator

There are many more ways to clone a cluster. My main recommendation is to
optimize for correctness and simplicity first, and only then for
performance/time. Ultimately, machine time for such a rare operation is
cheap, engineering time is expensive, and data inconsistency is priceless.

On Wed, Feb 5, 2020 at 5:24 PM Sergio  wrote:
>
> Another option is the DSE-bulk loader but it will require to convert to 
> csv/json (good option if you don't like to play with sstableloader and deal 
> to get all the sstables from all the nodes)
> https://docs.datastax.com/en/dsbulk/doc/index.html
>
> Cheers
>
> Sergio
>
> On Wed, 5 Feb 2020 at 16:56, Erick Ramirez wrote:
>>
>> Unfortunately, there isn't a guarantee that 2 nodes alone will have the full 
>> copy of data. I'd rather not say "it depends".
>>
>> TIP: If the nodes in the target cluster have identical tokens allocated, you 
>> can just do a straight copy of the sstables node-for-node then do nodetool 
>> refresh. If the target cluster is already built and you can't assign the 
>> same tokens then sstableloader is your only option. Cheers!
>>
>> P.S. No need to apologise for asking questions. That's what we're all here 
>> for. Just keep them coming.




Re: sstableloader: How much does it actually need?

2020-02-05 Thread Erick Ramirez
>
> Another option is the DSE-bulk loader but it will require to convert to
> csv/json (good option if you don't like to play with sstableloader and deal
> to get all the sstables from all the nodes)
> https://docs.datastax.com/en/dsbulk/doc/index.html
>

Thanks, Sergio. The DataStax Bulk Loader was developed for a completely
different use case. It doesn't really make sense to go through the trouble of
converting the SSTables to CSV/JSON when you've already got the SSTables to
begin with. ☺

It was really designed for loading/unloading data from non-C* sources as a
replacement for the COPY command. Cheers!


Nodes becoming unresponsive

2020-02-05 Thread Surbhi Gupta
Hi,

We have noticed that one node in a Cassandra cluster is at 100% CPU
utilization. Using top, we can see that the Cassandra process is showing
futex_wait.

We are on CentOS release 6.10 (Final). According to the document below, the
futex bug affected CentOS 6.6.
https://support.datastax.com/hc/en-us/articles/206259833-Nodes-appear-unresponsive-due-to-a-Linux-futex-wait-kernel-bug

Below are the installed patches.

sudo rpm -q --changelog kernel-`uname -r` | grep futex | grep ref
- [kernel] futex: Mention key referencing differences between shared and private futexes (Larry Woodman) [1167405]
- [kernel] futex: Ensure get_futex_key_refs() always implies a barrier (Larry Woodman) [1167405]
- [kernel] futex: Fix errors in nested key ref-counting (Denys Vlasenko) [1094458] {CVE-2014-0205}
- [kernel] futex_lock_pi() key refcnt fix (Danny Feng) [566347] {CVE-2010-0623}

top - 21:23:34 up 93 days, 10:43,  1 user,  load average: 137.35, 147.74, 148.52
Tasks: 658 total,   1 running, 657 sleeping,   0 stopped,   0 zombie
Cpu(s): 93.9%us,  1.9%sy,  2.0%ni,  2.0%id,  0.0%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  132236016k total, 129681568k used,  2554448k free,   215888k buffers
Swap:        0k total,        0k used,        0k free, 93679880k cached

   PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM      TIME+ WCHAN     COMMAND
  7725 cassandr  20   0  258g  40g  13g S 2302.0 32.4  305169:26 futex_wai java
 69075 logstash  39  19 10.5g 1.5g  14m S   42.1  1.2    6763:00 futex_wai java
 30008 root      20   0  465m  55m  11m S   11.5  0.0    0:02.78 poll_sche TaniumClient
 31785 cassandr  20   0 34.9g  31m  10m S    4.9  0.0    0:00.15 futex_wai java
  5154 root      20   0 1523m 6260 1300 S    3.0  0.0    1073:05 hrtimer_n collectd
  1129 root      20   0     0    0    0 S    1.3  0.0  294:55.87 kjournald jbd2/dm-0-8
 64173 root      20   0 1512m  71m  13m S    1.3  0.1    0:55.69 futex_wai TaniumClient

Any idea what else we can look at for the high CPU issue?

Thanks
Surbhi


Re: Nodes becoming unresponsive

2020-02-05 Thread Erick Ramirez
I wrote that article 5 years ago but I didn't think it would still be
relevant today. 😁

Have you tried to do a thread dump to see which are the most dominant
threads? That's the most effective way of troubleshooting high CPU
situations. Cheers!
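
One way to find the dominant threads, sketched under assumptions: list per-thread CPU with `top -H -p <pid>`, take a dump with `jstack <pid>`, and match the two. HotSpot thread dumps print each thread's native ID in hexadecimal as `nid=0x...`, while top shows the same ID in decimal. The function names below are hypothetical, and the inputs are expected to be collected by hand.

```python
import re

# Matches the header line of each thread in a HotSpot thread dump, e.g.
# "CompactionExecutor:1" #45 daemon prio=1 os_prio=4 tid=0x... nid=0x1e2d runnable
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)')


def threads_by_native_id(dump_text):
    """Map decimal native thread IDs to thread names found in a thread dump."""
    names = {}
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            names[int(match.group("nid"), 16)] = match.group("name")
    return names


def hottest_threads(cpu_by_tid, dump_text, limit=5):
    """Return (tid, cpu, thread name) for the busiest threads.

    cpu_by_tid maps the decimal thread IDs shown by `top -H` to their %CPU.
    """
    names = threads_by_native_id(dump_text)
    ranked = sorted(cpu_by_tid.items(), key=lambda item: item[1], reverse=True)
    return [
        (tid, cpu, names.get(tid, "<not in dump>"))
        for tid, cpu in ranked[:limit]
    ]
```

Taking several dumps a few seconds apart and comparing which thread names stay at the top helps separate sustained CPU consumers from short bursts.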



Re: Nodes becoming unresponsive

2020-02-05 Thread Jeff Jirsa
The bug is in the kernel - it'd be worth looking at your specific kernel
via `uname -a` just to confirm you're not somehow running an old kernel. If
you're sure you're on a good kernel, then yea, thread inspection is your
next step.
https://github.com/aragozin/jvm-tools/blob/master/sjk-core/docs/TTOP.md ,
for example.



On Wed, Feb 5, 2020 at 8:37 PM Erick Ramirez  wrote:

> I wrote that article 5 years ago but I didn't think it would still be
> relevant today. 😁
>
> Have you tried to do a thread dump to see which are the most dominant
> threads? That's the most effective way of troubleshooting high CPU
> situations. Cheers!
>


Re: Nodes becoming unresponsive

2020-02-05 Thread Erick Ramirez
Surbhi, just a *friendly* reminder that it's customary to reply back to the
mailing list instead of emailing me directly so that everyone else in the
list can participate. ☺


> I tried taking a thread dump using kill -3 <pid> but it just came back and
> no file was generated.
> How do you take the thread dump?


The Swiss Java Knife (SJK) which Jeff referenced previously is a nice
utility for it:

> ... then yea, thread inspection is your next step.
> https://github.com/aragozin/jvm-tools/blob/master/sjk-core/docs/TTOP.md ,
> for example.


Re: Nodes becoming unresponsive

2020-02-05 Thread Surbhi Gupta
Sure Erick...

I tried strace as well ...