Hi Alain,

If you look below (the chain is getting long, I know), you'll see I mentioned that we are
indeed using DCAwareRoundRobinPolicy:

"We use the DCAwareRoundRobinPolicy in our java datastax driver in each DC 
application to point to that Cassandra DC’s."

Indeed it is a trade-off having all data on all nodes, but this is to allow
one DC to go down, or two nodes within a single DC, to ensure maximum uptime.

I'm afraid that the applications are all reading from DC1, despite having a
preferred read DC of DC2.
I believe this is because the primary tokens were created in DC1, due to an
initial misconfiguration when our applications were first started and only
used DC1 to create the keyspaces and tables.

Steve


From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, 14 April 2016 at 12:57
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Balancing tokens over 2 datacenter

100% ownership on all nodes isn't wrong with 3 nodes in each of 2 DCs with RF=3
in both of those DCs. That's exactly what you'd expect it to be, and a
perfectly viable production config for many workloads.

+1, no doubt about it. The only thing is all the nodes own the exact same data,
meaning the data is replicated 6 times, once on each of the 6 machines. Data is
expensive but quite safe there; that's a tradeoff to consider, but it is ok
from a Cassandra point of view, nothing "wrong" there.


We see all the writes are balanced (due to the replication factor), but all
reads only go to DC1.
So that confirms the configuration I believed we had :)

Any way to balance the primary tokens over the two DCs? :)


Steve, I thought it was now ok.

Could you confirm this?

Are you using something like 'new DCAwareRoundRobinPolicy("DC1")' as pointed
out in Bhuvan's link
http://stackoverflow.com/questions/22813045/ability-to-write-to-a-particular-cassandra-node
 ? You can use some other load balancing policies as well.

Then make sure to deploy this on clients that need to use 'DC1', and 'new
DCAwareRoundRobinPolicy("DC2")' on clients that should be using 'DC2'.
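
To make that concrete, here is a minimal sketch of the DC1 client setup with the
2.x Java driver, reusing the masked node addresses from your nodetool output
below (the query is just a connectivity check):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

public class Dc1Client {
    public static void main(String[] args) {
        // Clients co-located with DC1 should only route requests to DC1 nodes.
        Cluster cluster = Cluster.builder()
                .addContactPoints("X.0.0.149", "X.0.0.251", "X.0.0.79") // DC1 nodes (masked)
                .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("DC1"))
                .build();
        Session session = cluster.connect();
        // Requests on this session are round-robined over DC1 nodes only.
        System.out.println(session.execute("SELECT release_version FROM system.local").one());
        cluster.close();
    }
}

The DC2 deployment is identical apart from its contact points and 'new
DCAwareRoundRobinPolicy("DC2")'. Note that on driver 3.0 the constructor is
deprecated in favour of DCAwareRoundRobinPolicy.builder().withLocalDc("DC1").build().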

Are your clients using the 'DCAwareRoundRobinPolicy', and are the clients from
the datacenter related to DC2 using 'new DCAwareRoundRobinPolicy("DC2")'?

This is really the only thing I can think about right now...

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-04-14 11:43 GMT+02:00 Walsh, Stephen <stephen.wa...@aspect.com>:
Thanks Guys,

I tend to agree that it's a viable configuration (but I'm biased).
We use Datadog monitoring to view reads and writes per node.

We see all the writes are balanced (due to the replication factor), but all
reads only go to DC1.
So that confirms the configuration I believed we had :)

Any way to balance the primary tokens over the two DCs? :)

Steve

From: Jeff Jirsa <jeff.ji...@crowdstrike.com>
Reply-To: <user@cassandra.apache.org>
Date: Thursday, 14 April 2016 at 03:05

To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" 
<user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Subject: Re: Balancing tokens over 2 datacenter

100% ownership on all nodes isn't wrong with 3 nodes in each of 2 DCs with RF=3
in both of those DCs. That's exactly what you'd expect it to be, and a
perfectly viable production config for many workloads.



From: Anuj Wadehra
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, April 13, 2016 at 6:02 PM
To: "user@cassandra.apache.org"
Subject: Re: Balancing tokens over 2 datacenter

Hi Stephen Walsh,

As per the nodetool output, every node owns 100% of the range. This indicates a
wrong configuration. It would be good if you could verify and share the
following properties from the yaml on all nodes:

num_tokens, seeds, cluster_name, listen_address, initial_token.

Also, which snitch are you using? If you use PropertyFileSnitch, please share
cassandra-topology.properties too.
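
For example, on a vnode setup you would expect something along these lines in
cassandra.yaml (the values below are purely illustrative, not your actual
settings):

num_tokens: 256
# initial_token:   (left unset when vnodes / num_tokens is used)
cluster_name: 'Test Cluster'
listen_address: X.0.0.149
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "X.0.0.149,X.0.2.32"
endpoint_snitch: GossipingPropertyFileSnitch

In particular, initial_token should not be set alongside num_tokens, and
cluster_name must match on all six nodes.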



Thanks
Anuj

Sent from Yahoo Mail on Android

On Wed, 13 Apr, 2016 at 9:46 PM, Walsh, Stephen
<stephen.wa...@aspect.com> wrote:
Right again, Alain.
We use the DCAwareRoundRobinPolicy in our Java DataStax driver in each DC
application to point to that Cassandra DC.



From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, 13 April 2016 at 15:52
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Balancing tokens over 2 datacenter

Steve,

This cluster looks just great.

Now, due to a misconfiguration in our application, we saw that our
applications in both DCs were pointing to DC1.

This is the only thing to solve, and it happens in the client-side
configuration.

What client do you use?

Are you using something like 'new DCAwareRoundRobinPolicy("DC1")' as pointed
out in Bhuvan's link
http://stackoverflow.com/questions/22813045/ability-to-write-to-a-particular-cassandra-node
 ? You can use some other load balancing policies as well.

Then make sure to deploy this on clients that need to use 'DC1', and 'new
DCAwareRoundRobinPolicy("DC2")' on clients that should be using 'DC2'.

Make sure ports are open.

This should be it,

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



2016-04-13 16:28 GMT+02:00 Walsh, Stephen <stephen.wa...@aspect.com>:
Thanks for your help, guys.

As you guessed our schema is

{'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3'}  AND 
durable_writes = false;


Our reads and writes are at LOCAL_ONE, with each application (now) using its
own DC as its preferred DC.
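
(For reference, this is roughly how we set those defaults in the driver; a
sketch only, using one of our DC1 node addresses from the nodetool output
below as the contact point:)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

public class Dc1ClientDefaults {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("X.0.0.149") // a DC1 node (masked address)
                // Prefer the local DC for request routing...
                .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("DC1"))
                // ...and make LOCAL_ONE the default consistency level for all statements.
                .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
                .build();
        cluster.close();
    }
}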

Here is the nodetool status output for one of our keyspaces (all tables are
created the same way):


Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  X.0.0.149  14.6 MB   256     100.0%            0f497235-a0bb-4e47-9434-dd0e126aa432  RAC3
UN  X.0.0.251  12.33 MB  256     100.0%            a1307717-4b61-4d57-8658-50460d6d54a1  RAC1
UN  X.0.0.79   21.54 MB  256     100.0%            f353c8f3-6b7c-483b-ad9a-3d66d469079e  RAC2

Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  X.0.2.32   18.08 MB  256     100.0%            103a1cb3-6580-44bd-bf97-28ae160e1119  RAC6
UN  X.0.2.211  12.46 MB  256     100.0%            8c8dd5ba-806d-43eb-9ee5-af463e443f46  RAC5
UN  X.0.2.186  12.58 MB  256     100.0%            aef904ba-aaab-47f1-9bdc-cc1e0c676f61  RAC4


We ran nodetool repair and cleanup in case the nodes were balanced but
needed cleaning up; this was not the case :(


Steve


From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, 13 April 2016 at 14:48
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Balancing tokens over 2 datacenter

Hi Steve,

As such, all keyspaces and tables were created on DC1.
The effect of this is that all reads are now going to DC1 and ignoring DC2.

I think this is not exactly true. When tables are created, they are created in
a specific keyspace; no matter where you send the schema change command, the
schema will propagate to all the datacenters the keyspace is replicated to.

So the question is: Is your keyspace using 'DC1: 3, DC2: 3' as replication 
factors? Could you show us the schema and a nodetool status as well?
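
If it is easier, you can also read the replication settings straight from the
driver's cluster metadata; a quick sketch ('my_keyspace' is a placeholder for
your keyspace name):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.KeyspaceMetadata;

public class CheckReplication {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("X.0.0.149").build();
        // getMetadata() initializes the connection and fetches the schema.
        KeyspaceMetadata ks = cluster.getMetadata().getKeyspace("my_keyspace");
        // For your setup this should report NetworkTopologyStrategy with DC1=3 and DC2=3.
        System.out.println(ks.getReplication());
        cluster.close();
    }
}

('DESCRIBE KEYSPACE my_keyspace;' in cqlsh shows the same information.)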

We've tried nodetool repair / cleanup, but the reads always go to DC1

Trying to do random things is often not a good idea. For example, as each node 
holds 100% of the data, cleanup is an expensive no-op :-).

Anyone know how to rebalance the tokens over the DCs?

Yes, I can help on that, but I need to know your current status.

Basically, your keyspace(s) must be using an RF of 3 on the 2 DCs as mentioned,
and your clients must be configured to stick to the DC in their zone (use a
DCAware policy with the DC name + LOCAL_ONE/LOCAL_QUORUM consistency, see
Bhuvan's links). Then things should be better.

If you need more detailed help, let us know what is unclear to you and provide
us with the 'nodetool status' output and your schema (at least the keyspace
configs).

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com







2016-04-13 15:32 GMT+02:00 Bhuvan Rawal <bhu1ra...@gmail.com>:
This could be because of the way you have configured the policy. Have a look
at the links below for configuring it:

https://datastax.github.io/python-driver/api/cassandra/policies.html

http://stackoverflow.com/questions/22813045/ability-to-write-to-a-particular-cassandra-node

Regards,
Bhuvan

On Wed, Apr 13, 2016 at 6:54 PM, Walsh, Stephen <stephen.wa...@aspect.com> 
wrote:
Hi there,

So we have 2 datacenters with 3 nodes each.
The replication factor is 3 per DC (so each node has all the data).

We have an application in each DC that writes to that Cassandra DC.

Now, due to a misconfiguration in our application, we saw that our
applications in both DCs were pointing to DC1.

As such, all keyspaces and tables were created on DC1.
The effect of this is that all reads are now going to DC1 and ignoring DC2.

We've tried nodetool repair / cleanup, but the reads always go to DC1.

Anyone know how to rebalance the tokens over the DCs?


Regards
Steve


P.S. I know about this article:
http://www.datastax.com/dev/blog/balancing-your-cassandra-cluster
but it doesn't answer my question regarding token balancing across 2 DCs.

This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments.

