Re: Running Large Clusters in Production

2020-07-13 Thread Isaac Reath (BLOOMBERG/ 919 3RD A)
Thanks for the info Jeff, all very helpful!

From: user@cassandra.apache.org At: 07/11/20 12:30:36
To: user@cassandra.apache.org
Subject: Re: Running Large Clusters in Production

Gossip-related stuff eventually becomes the issue.

For example, when a new host joins the cluster (or replaces a failed host), the 
new bootstrapping tokens go into a “pending range” set. Writes then merge 
pending ranges with final ranges, and the data structures involved here weren’t 
necessarily designed for hundreds of thousands of ranges, so it’s likely they 
stop behaving at some point 
(https://issues.apache.org/jira/browse/CASSANDRA-6345 and 
https://issues.apache.org/jira/browse/CASSANDRA-6127 are examples, but there 
have been others).
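
As rough back-of-the-envelope arithmetic (not a figure from this thread): the 
number of token ranges scales as num_tokens × node count, so with the old 
default of 256 vnodes per node,

    256 vnodes/node × 1,000 nodes = 256,000 token ranges

which is the “hundreds of thousands of ranges” regime described above.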

Unrelated to vnodes: until Cassandra 4.0, internode messaging requires basically 
6 threads per instance - 3 for ingress and 3 for egress - to every other host in 
the cluster. The full mesh gets pretty expensive; it was rewritten in 4.0, so 
that thousand-host number may go up quite a bit after that. 
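
Again as rough arithmetic only (assuming the 6-threads-per-peer figure above, 
which may not be exact for every configuration):

    messaging threads per node ≈ 6 × (N − 1)
    N = 1,000  →  roughly 6,000 messaging threads on every node

which illustrates why the full mesh gets expensive at that scale.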


On Jul 11, 2020, at 9:16 AM, Isaac Reath (BLOOMBERG/ 919 3RD A)  wrote:



Thank you John and Jeff. I was leaning towards sharding, and this really helps 
support that opinion. Would you mind explaining a bit more about what it was 
about vnodes that caused those issues?

From: user@cassandra.apache.org At: 07/10/20 19:06:27
To: user@cassandra.apache.org
Cc:  Isaac Reath (BLOOMBERG/ 919 3RD A ) 
Subject: Re: Running Large Clusters in Production

I worked on a handful of large clusters (> 200 nodes) using vnodes, and there 
were some serious issues with both performance and availability.  We had to put 
in a LOT of work to fix the problems.

I agree with Jeff - it's way better to manage multiple clusters than a really 
large one.


On Fri, Jul 10, 2020 at 2:49 PM Jeff Jirsa  wrote:

1000 instances are fine if you're not using vnodes.

I'm not sure what the limit is if you're using vnodes. 

If you might get to 1000, shard early before you get there. Running 8x100 host 
clusters will be easier than one 800 host cluster.
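
To make the sharding idea concrete, here is a minimal sketch (illustrative names 
and routing scheme, not something prescribed in this thread) of how an 
application can pin each partition key to one of several physically separate 
clusters. Note that changing the number of clusters reshuffles keys, so the 
split is usually decided up front:

    // Hypothetical sketch only: route each partition key to one of several
    // separate Cassandra clusters. All names here are placeholders.
    import java.nio.charset.StandardCharsets;
    import java.util.List;
    import java.util.zip.CRC32;

    final class ClusterRouter {
        private final List<String> clusterContactPoints; // one entry per sub-cluster

        ClusterRouter(List<String> clusterContactPoints) {
            this.clusterContactPoints = clusterContactPoints;
        }

        /** Deterministically map a partition key to a sub-cluster index. */
        int shardFor(String partitionKey) {
            CRC32 crc = new CRC32();
            crc.update(partitionKey.getBytes(StandardCharsets.UTF_8));
            return (int) (crc.getValue() % clusterContactPoints.size());
        }

        /** Contact point of the cluster that owns this key. */
        String contactPointFor(String partitionKey) {
            return clusterContactPoints.get(shardFor(partitionKey));
        }
    }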


On Fri, Jul 10, 2020 at 2:19 PM Isaac Reath (BLOOMBERG/ 919 3RD A) 
 wrote:

Hi All,

I’m currently dealing with a use case that is running on around 200 nodes. Due 
to growth of their product as well as onboarding additional data sources, we 
are looking at having to expand that to around 700 nodes, and potentially 
beyond to 1000+. To that end I have a couple of questions:

1) For those who have experienced managing clusters at that scale, what types 
of operational challenges have you run into that you might not see when 
operating 100 node clusters? A couple that come to mind: version (especially 
major version) upgrades become a lot riskier, since it is no longer feasible 
to do a blue/green style deployment of the database, and backup & restore 
operations seem far more error prone for the same reason (having to do an 
in-place restore instead of being able to spin up a new cluster to restore 
to). 

2) Is there a cluster size beyond which sharding across multiple clusters 
becomes the recommended approach?

Thanks,
Isaac




RE: Running Large Clusters in Production

2020-07-13 Thread Durity, Sean R
I’m curious – is the scaling needed for the amount of data, the amount of user 
connections, throughput, or what? I have a 200ish-node cluster, but it is 
primarily a disk space issue. When I can have (and administer) nodes with large 
disks, the cluster size will shrink.


Sean Durity


Remove from listserv

2020-07-13 Thread Marie-Anne
Remove me from Cassandra listserv.

maharkn...@comcast.net

 



Re: Upgrading cassandra cluster from 2.1 to 3.X when using custom TWCS

2020-07-13 Thread Gil Ganz
Jon, great advice. I ended up doing just that; if anyone else needs to do
the same thing, here is a 3.11.6 version:

https://github.com/gilganz/twcs/tree/cassandra-3.11

Indeed, switching between the two TWCS strategies did not trigger any sstable change.
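
In case it helps anyone following the same path, the switch itself is just an 
ALTER TABLE; a hypothetical sketch (the keyspace/table names are placeholders, 
the class name must match whatever jar you actually deploy, and the window 
settings are illustrative):

    ALTER TABLE my_ks.my_table WITH compaction = {
      'class': 'com.example.TimeWindowCompactionStrategy',
      'compaction_window_unit': 'DAYS',
      'compaction_window_size': '1'
    };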

Jeff, nothing to be sorry about; it gave me an opportunity to brush up my Java
skills ;)

On Thu, Jul 9, 2020 at 7:59 PM Jeff Jirsa  wrote:

>
>
> On Thu, Jul 9, 2020 at 9:02 AM Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
>> On Thu, Jul 9, 2020 at 5:54 PM Gil Ganz  wrote:
>>
>> Another question, did changing the compaction strategy from one twcs to
>>> the other trigger merging of old sstables?
>>>
>>
>> I don't recall any unexpected action from changing the strategy, but of
>> course you should verify on a test system first if you have one.
>>
>>
> Should be exactly the same between the two, AFAIK.
>
> (Sorry my repo may not work on modern 3.0/3.11 branches.)
>


RE: Upgrading cassandra cluster from 2.1 to 3.X when using custom TWCS

2020-07-13 Thread Marie-Anne
How do I get off this listserv?

 




Re: Running Large Clusters in Production

2020-07-13 Thread Reid Pinchback
I don’t know if it’s the OP’s intent in this case, but the response latency 
profile will likely be different for two clusters that are equivalent in total 
storage but different in node count. There are multiple reasons for that, but 
probably the biggest is that you’re changing a divisor in the I/O queuing 
statistics that matter to compaction-triggered dirty page flushes, and I’d 
expect you would see that in latencies. Speculative retry stats used to bounce 
past slow nodes busy with garbage collections might shift a bit too.

R

From: "Durity, Sean R" 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, July 13, 2020 at 10:48 AM
To: "user@cassandra.apache.org" 
Subject: RE: Running Large Clusters in Production

Message from External Sender
I’m curious – is the scaling needed for the amount of data, the amount of user 
connections, throughput or what? I have a 200ish cluster, but it is primarily a 
disk space issue. When I can have (and administer) nodes with large disks, the 
cluster size will shrink.


Sean Durity

From: Isaac Reath (BLOOMBERG/ 919 3RD A) 
Sent: Monday, July 13, 2020 10:35 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Running Large Clusters in Production

Thanks for the info Jeff, all very helpful!
From: user@cassandra.apache.org At: 07/11/20 
12:30:36
To: user@cassandra.apache.org
Subject: Re: Running Large Clusters in Production

Gossip related stuff eventually becomes the issue

For example, when a new host joins the cluster (or replaces a failed host), the 
new bootstrapping tokens go into a “pending range” set. Writes then merge 
pending ranges with final ranges, and the data structures involved here weren’t 
necessarily designed for hundreds of thousands of ranges, so it’s likely they 
stop behaving at some point 
(https://issues.apache.org/jira/browse/CASSANDRA-6345 
[issues.apache.org]
 , https://issues.apache.org/jira/browse/CASSANDRA-6127 
[issues.apache.org]
   as an example, but there have been others)

Unrelated to vnodes, until cassandra 4.0, the internode messaging requires 
basically 6 threads per instance - 3 for ingress and 3 for egress, to every 
other host in the cluster. The full mesh gets pretty expensive, it was 
rewritten in 4.0 and that thousand number may go up quite a bit after that.



On Jul 11, 2020, at 9:16 AM, Isaac Reath (BLOOMBERG/ 919 3RD A) 
mailto:ire...@bloomberg.net>> wrote:
Thank you John and Jeff, I was leaning towards sharding and this really helps 
support that opinion. Would you mind explaining a bit more what about vnodes 
caused those issues?
From: user@cassandra.apache.org At: 07/10/20 
19:06:27
To: user@cassandra.apache.org
Cc: Isaac Reath (BLOOMBERG/ 919 3RD A )
Subject: Re: Running Large Clusters in Production

I worked on a handful of large clusters (> 200 nodes) using vnodes, and there 
were some serious issues with both performance and availability.  We had to put 
in a LOT of work to fix the problems.

I agree with Jeff - it's way better to manage multiple clusters than a really 
large one.


On Fri, Jul 10, 2020 at 2:49 PM Jeff Jirsa 
mailto:jji...@gmail.com>> wrote:
1000 instances are fine if you're not using vnodes.

I'm not sure what the limit is if you're using vnodes.

If you might get to 1000, shard early before you get there. Running 8x100 host 
clusters will be easier than one 800 host cluster.


On Fri, Jul 10, 2020 at 2:19 PM Isaac Reath (BLOOMBERG/ 919 3RD A) 
mailto:ire...@bloomberg.net>> wrote:
Hi All,

I’m currently dealing with a use case that is running on around 200 nodes, due 
to growth of their product as well as onboarding additional data sources, we 
are looking at having to expand that to around 700 nodes, and potentially 
beyond to 1000+. To that end I have a couple of questions:

1) For those who have experienced managing clusters at that scale, what types 
of operational challenges have you run into that you might not see when 
operating 100 node clusters? A couple that come to mind are version (especially 
major version) upgrades become a lot more risky as it no longer becomes 
feasible to do a blue / green style deployment of the database and backup & 
restore operations seem far more error prone as well for the same reason 
(having to do an in-place restore instead of being able to spin up a new 
cluster to restore to).

2) Is there a cluster size beyond which sharding across multiple clusters 
becomes the recommended approach?

Thanks,
Isaac






The information in this Internet Email is confidential and may be le

Cqlsh copy command on a larger data set

2020-07-13 Thread Jai Bheemsen Rao Dhanwada
Hello,

I would like to copy some data from one Cassandra cluster to another Cassandra
cluster using the cqlsh COPY command. Is this a good approach if the dataset
size on the source cluster is very high (500 GB - 1 TB)? If not, what is a safe
approach? And are there any limitations/known issues to keep in mind before
attempting this?


Re: Cqlsh copy command on a larger data set

2020-07-13 Thread Kiran mk
I wouldn't say it's a good approach for that size, but you can try the dsbulk
approach too.

Try to split the output into multiple files.

Best Regards,
Kiran M K

On Tue, Jul 14, 2020, 5:17 AM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Hello,
>
> I would like to copy some data from one cassandra cluster to another
> cassandra cluster using the CQLSH copy command. Is this the good approach
> if the dataset size on the source cluster is very high(500G - 1TB)? If not
> what is the safe approach? and are there any limitations/known issues to
> keep in mind before attempting this?
>
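
For concreteness, hypothetical command lines for both approaches discussed 
above (hosts, keyspace, table, and paths are placeholders; verify the options 
against the cqlsh and dsbulk versions you actually have installed):

    # cqlsh COPY: export on the source cluster, then import on the target.
    cqlsh source-host -e "COPY ks.tbl TO 'tbl.csv' WITH NUMPROCESSES=8 AND PAGESIZE=1000"
    cqlsh target-host -e "COPY ks.tbl FROM 'tbl.csv' WITH NUMPROCESSES=8 AND MAXBATCHSIZE=20"

    # DSBulk: generally more robust for multi-hundred-GB tables, and it splits
    # the unloaded data across multiple files.
    dsbulk unload -h source-host -k ks -t tbl -url ./tbl_dump
    dsbulk load   -h target-host -k ks -t tbl -url ./tbl_dump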


Relation between num_tokens and cluster extend limitations

2020-07-13 Thread onmstester onmstester
Hi,

I'm using allocate_tokens_for_keyspace and num_tokens=32, and I want to extend 
the size of some clusters.

I read in articles that with num_tokens=4 one should add 25% more nodes for the 
cluster to become balanced again.



1. For example, with num_tokens=4 and 16 nodes already in the cluster, I should 
add 4 nodes to be balanced again, and my cluster would be in a dangerous, 
unbalanced state until I add the 4th node?

Or should I add all 4 nodes at once (I don't know how, because I should wait 
for each node to stream in multiple TBs)?



2. With num_tokens=32, for these different cluster sizes, how many nodes should 
I add to be balanced again?

a. cluster size = 16

b. cluster size = 32

c. cluster size = 70



3. I cannot understand why Cassandra could not keep the cluster balanced when 
you add/remove a single node; how does the mechanism work?

4. Is the limitation the same when shrinking the cluster?



Thank you in advance
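
For reference, the two settings mentioned above live in cassandra.yaml; a 
minimal hypothetical fragment (the keyspace name is a placeholder, and 
allocate_tokens_for_keyspace must name a keyspace whose replication settings 
already exist when a new node bootstraps):

    # cassandra.yaml (excerpt)
    num_tokens: 32
    allocate_tokens_for_keyspace: my_keyspace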

design principle to manage roll back

2020-07-13 Thread Manu Chadha
Hi

What design approaches can I follow to ensure that data is consistent from an 
application perspective (not from an individual table's perspective)? I am 
thinking of issues which arise due to the unavailability of rollback and atomic 
transactions in Cassandra. Is Cassandra not suitable for my project?

Cassandra recommends creating a new table for each query. This results in data 
duplication (which doesn’t bother me). Take the following scenario: an 
application which allows users to create, share and manage food recipes. Each 
of the functions below adds records to a separate table:


// Scala-style pseudocode: each step persists the recipe into a different table
for {
  savedRecipe            <- saveInRecipeRepository(...)
  recipeTagRepository    <- saveRecipeTag(...)
  partitionInfoOfRecipes <- savePartitionOfTheTag(...)
  updatedUserProfile     <- updateInUserProfile(...)
  recipesByUser          <- saveRecipesCreatedByUser(...)
  supportedRecipes       <- updateSupportedRecipesInformation(tag)
}

If, say, updateInUserProfile fails, then I'll have to manage rollback in the 
application itself, as Cassandra doesn’t do it. My concern is that the rollback 
process could itself fail, due to network issues for example.

Is there a recommended way or a design principle I can follow to keep data 
consistent?
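
One building block worth evaluating here (a hypothetical sketch, not a full 
answer to the question above) is Cassandra's logged batch: it does not provide 
rollback or isolation, but it does guarantee that if any statement in the batch 
is applied, all of them eventually will be. Table and column names below are 
illustrative:

    -- Hypothetical sketch: group the per-query-table writes into one logged batch
    BEGIN BATCH
      INSERT INTO recipes_by_id   (recipe_id, title, body)    VALUES (?, ?, ?);
      INSERT INTO recipes_by_user (user_id, recipe_id, title) VALUES (?, ?, ?);
      INSERT INTO recipes_by_tag  (tag, recipe_id, title)     VALUES (?, ?, ?);
    APPLY BATCH;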

Thanks
Manu

Sent from Mail for Windows 10