Re: Getting all unique keys

2017-08-21 Thread Christophe Schmitz
Hi Avi, The spark-project documentation is quite good, as is the spark-cassandra-connector GitHub project, which contains some basic examples you can easily draw inspiration from. A few random pieces of advice you might find useful: - You will want one Spark worker on each node, and a Spark master on eith

Limit on having number of nodes in C* cluster

2017-08-21 Thread techpyaasa .
Hi, is there any limit on the number of nodes in a C* cluster? Right now we have a c*-2.1.17 cluster with 3 DCs, each DC with 3 groups, and each group has 21 nodes. We want to increase the cluster capacity by adding 6 nodes per group, as disk usage on many of the nodes has crossed 65%. So just wanted to clarify

Re: Limit on having number of nodes in C* cluster

2017-08-21 Thread Vladimir Yudovin
Actually there are clusters of thousands of nodes: Some of the largest production deployments include Apple's, with over 75,000 nodes storing over 10 PB of data. Best regards, Vladimir Yudovin, Winguzone - Cloud Cassandra Hosting On Mon, 21 Aug 2017 08:35:37 -0400 techpyaasa .

Cassandra 3.11 is compacting forever

2017-08-21 Thread Igor Leão
Hi there, I've been trying to upgrade a Cassandra 3.9 cluster to Cassandra 3.11. Whenever I try to add a new Cassandra 3.11 node to the main datacenter, using `-Dcassandra.force_3_0_protocol_version=true` on the new node, this new node uses almost 100% of its CPU. Checking `nodetool compactionstat

Re: Moving all LCS SSTables to a repaired state

2017-08-21 Thread Sotirios Delimanolis
We are on 2.2.11 so we're all right on that front. The advice is difficult to implement unfortunately, with so many nodes. Thanks for the information! On Sunday, August 20, 2017, 4:28:36 PM PDT, kurt greaves wrote: Correction: Full repairs do mark SSTables as repaired in 2.2 (CASSANDRA-7586). M

Re: Limit on having number of nodes in C* cluster

2017-08-21 Thread techpyaasa .
Thanks a lot for the reply :) On Aug 21, 2017 6:44 PM, "Vladimir Yudovin" wrote: > Actually there are clusters of thousands of nodes: Some of the largest > production deployments include Apple's, with over 75,000 nodes storing over > 10 PB of data > > Best regards, Vladim

Re: Limit on having number of nodes in C* cluster

2017-08-21 Thread Jon Haddad
As far as I know, those 75K nodes are not in a single cluster. If memory serves correctly (and this article seems to indicate that it does http://www.techrepublic.com/article/apples-secret-nosql-sauce-includes-a-hefty-dose-of-cassandra/

Re: Moving all LCS SSTables to a repaired state

2017-08-21 Thread kurt greaves
Is there any specific reason you are trying to achieve this? It shouldn't really matter if you have a few SSTables in the unrepaired pool.

Re: Cassandra 3.11 is compacting forever

2017-08-21 Thread kurt greaves
Why are you adding new nodes? If you're upgrading you should upgrade the existing nodes first and then add nodes.

Re: Moving all LCS SSTables to a repaired state

2017-08-21 Thread Sotirios Delimanolis
See my other email to this list that you replied to (I most recently replied late last week), titled "Cassandra isn't compacting old files". It's not just a few. It's tens/hundreds. I'm worried there's some "starvation" going on and disk is being filled with data that could be compacted away. 

Re: Limit on having number of nodes in C* cluster

2017-08-21 Thread Eduard Tudenhoefner
We've been doing successful testing with multi-DC setups and 500 nodes per DC. However, I agree with Jon here. Certain things are easier/faster with e.g. 5x100-node clusters than with a 1x500-node cluster. Cheers On Mon, Aug 21, 2017 at 10:16 AM, Jon Haddad wrote: > As far as I know, those 75K nodes a

Re: Cassandra isn't compacting old files

2017-08-21 Thread kurt greaves
Sorry about that Sotirios, I didn't make the connection between the two threads and this one dropped off my radar. > Is it possible that there's always a compaction to be run in the > "repaired" state, with that many SSTables, that unrepaired compactions are > essentially "starved", considering th

Re: Getting all unique keys

2017-08-21 Thread Avi Levi
Thanks Christophe, we will definitely consider that in the future. On Mon, Aug 21, 2017 at 3:01 PM, Christophe Schmitz < christo...@instaclustr.com> wrote: > Hi Avi, > > The spark-project documentation is quite good, as well as the > spark-cassandra-connector github project, which contains some b