Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 1:27 PM, Sasha Dolgy wrote: Replication factor is defined per keyspace, if I'm not mistaken. Can't remember if NTS is per keyspace or per cluster ... if it's per keyspace, that would be a way around it ... without having to maintain multiple clusters, just have multiple keyspaces

Re: Docs: Token Selection

2011-06-17 Thread Sasha Dolgy
Replication factor is defined per keyspace, if I'm not mistaken. Can't remember if NTS is per keyspace or per cluster ... if it's per keyspace, that would be a way around it ... without having to maintain multiple clusters, just have multiple keyspaces ... On Fri, Jun 17, 2011 at 9:23 PM, AJ

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 12:32 PM, Jeremiah Jordan wrote: Run two clusters, one which has {DC1:2, DC2:1} and one which is {DC1:1,DC2:2}. You can't have both in the same cluster, otherwise it isn't possible to tell where the data got written when you want to read it. For a given key "XYZ" you must b

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 12:33 PM, Eric tamme wrote: As I said previously, trying to make Cassandra treat things differently based on some kind of persistent locality set it maintains in memory .. or whatever .. sounds like you will be absolutely undermining the core principles of how Cassandra works.

Re: Docs: Token Selection

2011-06-17 Thread Eric tamme
> Yes.  But, the more I think about it, the more I see issues.  Here is what I > envision (Issues marked with *): > > Three or more dc's, each serving as fail-overs for the others with 1 maximum > unavailable dc supported at a time. > Each dc is a production dc serving users that I choose. > Each d

RE: Docs: Token Selection

2011-06-17 Thread Jeremiah Jordan
,DC2:2}. -Original Message- From: AJ [mailto:a...@dude.podzone.net] Sent: Friday, June 17, 2011 1:02 PM To: user@cassandra.apache.org Subject: Re: Docs: Token Selection Hi Jeremiah, can you give more details? Thanks On 6/17/2011 10:49 AM, Jeremiah Jordan wrote: > Run two Cassandra clusters... >

Re: Docs: Token Selection

2011-06-17 Thread AJ
Hi Jeremiah, can you give more details? Thanks On 6/17/2011 10:49 AM, Jeremiah Jordan wrote: Run two Cassandra clusters... -Original Message- From: Eric tamme [mailto:eta...@gmail.com] Sent: Friday, June 17, 2011 11:31 AM To: user@cassandra.apache.org Subject: Re: Docs: Token

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 10:31 AM, Eric tamme wrote: What I don't like about NTS is I would have to have more replicas than I need. {DC1=2, DC2=2}, RF=4 would be the minimum. If I felt that 2 local replicas was insufficient, I'd have to move up to RF=6 which seems like a waste... I'm predicting data in the

Re: Docs: Token Selection

2011-06-17 Thread AJ
+1 Yes, that is what I'm talking about Eric. Maybe I could write my own strategy, I dunno. I'll have to understand more first. On 6/17/2011 10:37 AM, Sasha Dolgy wrote: +1 for this if it is possible... On Fri, Jun 17, 2011 at 6:31 PM, Eric tamme wrote: What I don't like about NTS is I wou

RE: Docs: Token Selection

2011-06-17 Thread Jeremiah Jordan
Run two Cassandra clusters... -Original Message- From: Eric tamme [mailto:eta...@gmail.com] Sent: Friday, June 17, 2011 11:31 AM To: user@cassandra.apache.org Subject: Re: Docs: Token Selection > What I don't like about NTS is I would have to have more replicas than > I ne

Re: Docs: Token Selection

2011-06-17 Thread Sasha Dolgy
+1 for this if it is possible... On Fri, Jun 17, 2011 at 6:31 PM, Eric tamme wrote: >> What I don't like about NTS is I would have to have more replicas than I >> need.  {DC1=2, DC2=2}, RF=4 would be the minimum.  If I felt that 2 local >> replicas was insufficient, I'd have to move up to RF=6 wh

Re: Docs: Token Selection

2011-06-17 Thread Eric tamme
> What I don't like about NTS is I would have to have more replicas than I > need.  {DC1=2, DC2=2}, RF=4 would be the minimum.  If I felt that 2 local > replicas was insufficient, I'd have to move up to RF=6 which seems like a > waste... I'm predicting data in the TB range so I'm trying to keep rep

Re: Docs: Token Selection

2011-06-17 Thread Eric tamme
On Fri, Jun 17, 2011 at 12:07 PM, AJ wrote: > Thanks Jonathan.  I assumed since each data center owned the full key space > that the first replica would be stored in the dc of the coordinating node, > the 2nd in another dc, and the 3rd+ back in the 1st dc.  But, are you saying > that the first end

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 7:26 AM, William Oberman wrote: I haven't done it yet, but when I researched how to make geo-diverse/failover DCs, I figured I'd have to do something like RF=6, strategy = {DC1=3, DC2=3}, and LOCAL_QUORUM for reads/writes. This gives you an "ack" after 2 local nodes do the read/wr

Re: Docs: Token Selection

2011-06-17 Thread AJ
Thanks Jonathan. I assumed since each data center owned the full key space that the first replica would be stored in the dc of the coordinating node, the 2nd in another dc, and the 3rd+ back in the 1st dc. But, are you saying that the first endpoint is selected regardless of the location of t

Re: Docs: Token Selection

2011-06-17 Thread William Oberman
I haven't done it yet, but when I researched how to make geo-diverse/failover DCs, I figured I'd have to do something like RF=6, strategy = {DC1=3, DC2=3}, and LOCAL_QUORUM for reads/writes. This gives you an "ack" after 2 local nodes do the read/write, but the data eventually gets distributed to
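The arithmetic behind William's figures can be checked with a small sketch. It assumes the usual quorum rule (a majority of the replicas counted, i.e. floor(n/2) + 1); the helper name `quorum` and the `strategy` dictionary are illustrative only, not Cassandra API.

```python
def quorum(replicas):
    """Majority of a replica count: floor(n / 2) + 1 (assumed quorum rule)."""
    return replicas // 2 + 1

# Per-DC replica counts from the message: strategy = {DC1=3, DC2=3}
strategy = {"DC1": 3, "DC2": 3}

total_rf = sum(strategy.values())       # overall RF across both DCs
local_acks = quorum(strategy["DC1"])    # acks needed for LOCAL_QUORUM in DC1
cluster_acks = quorum(total_rf)         # acks a cluster-wide QUORUM would need

print(total_rf, local_acks, cluster_acks)
```

With three replicas per DC, LOCAL_QUORUM waits for 2 local nodes, which matches the "ack after 2 local nodes" described in the message, whereas a cluster-wide quorum over RF=6 would need 4.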

Re: Docs: Token Selection

2011-06-16 Thread Jonathan Ellis
Replication location is determined by the row key, not the location of the client that inserted it. (Otherwise, without knowing what DC a row was inserted in, you couldn't look it up to read it!) On Fri, Jun 17, 2011 at 12:20 AM, AJ wrote: > On 6/16/2011 9:45 PM, aaron morton wrote: >>> >>> But,
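Jonathan's point, that placement depends only on the row key's token and never on where the client sits, can be illustrated with a simplified sketch. It assumes each DC is walked as its own ring and ignores racks entirely, so it is not the real NetworkTopologyStrategy code; `replicas_for_token` and the toy 16-token ring are hypothetical.

```python
from bisect import bisect_left

def replicas_for_token(key_token, dc_tokens, rf_per_dc):
    """Pick replica tokens per DC from the key's token alone (racks ignored)."""
    placement = {}
    for dc, tokens in dc_tokens.items():
        ring = sorted(tokens)
        if not ring:
            continue
        # First node whose token is >= the key token, wrapping past the end.
        start = bisect_left(ring, key_token) % len(ring)
        count = min(rf_per_dc.get(dc, 0), len(ring))
        placement[dc] = [ring[(start + k) % len(ring)] for k in range(count)]
    return placement

# Toy ring of size 16 with two nodes per DC, DC2 offset by one token.
dc_tokens = {"DC1": [0, 8], "DC2": [1, 9]}
rf = {"DC1": 1, "DC2": 1}

print(replicas_for_token(5, dc_tokens, rf))
print(replicas_for_token(10, dc_tokens, rf))
```

The function takes no client or coordinator argument at all, which is the property that lets any node locate a row for a read.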

Re: Docs: Token Selection

2011-06-16 Thread AJ
On 6/16/2011 9:45 PM, aaron morton wrote: But, I'm thinking about using OldNetworkTopStrat. NetworkTopologyStrategy is where it's at. Oh yeah? It didn't look like it would serve my requirements. I want 2 full production geo-diverse data centers with each serving as a failover for the other

Re: Docs: Token Selection

2011-06-16 Thread aaron morton
> But, I'm thinking about using OldNetworkTopStrat. NetworkTopologyStrategy is where it's at. A - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 17 Jun 2011, at 01:39, AJ wrote: > Thanks Eric! I've finally got it! I feel like I've jus

Re: Docs: Token Selection

2011-06-16 Thread Eric tamme
On Thu, Jun 16, 2011 at 11:11 AM, Sasha Dolgy wrote: > So, with ec2 ... 3 regions (DC's), each one is +1 from another? I don't use EC2, so I am not familiar with the specifics of deployment there. That said, if you have 3 data centers with equal nodes in each (so that you would calculate the

Re: Docs: Token Selection

2011-06-16 Thread Sasha Dolgy
So, with ec2 ... 3 regions (DC's), each one is +1 from another? On Jun 16, 2011 3:40 PM, "AJ" wrote: > Thanks Eric! I've finally got it! I feel like I've just been initiated > or something by discovering this "secret". I kid! > > But, I'm thinking about using OldNetworkTopStrat. Do you, or any

Re: Docs: Token Selection

2011-06-16 Thread AJ
Thanks Eric! I've finally got it! I feel like I've just been initiated or something by discovering this "secret". I kid! But, I'm thinking about using OldNetworkTopStrat. Do you, or anyone else, know if the same rules for token assignment applies to ONTS? On 6/16/2011 7:21 AM, Eric tamme

Re: Docs: Token Selection

2011-06-16 Thread AJ
LOL, I feel Eric's pain. This double-ring thing can throw you for a loop since, like I said, there is only one place it is documented and it is only *implied*, so one is not sure he is interpreting it correctly. Even the source for NTS doesn't mention this. Thanks for everyone's help on this

Re: Docs: Token Selection

2011-06-16 Thread Eric tamme
AJ, sorry I seemed to miss the original email on this thread. As Aaron said, when computing tokens for multiple data centers, you should compute them independently for each data center - as if it were its own Cassandra cluster. You can have "overlapping" token ranges between multiple data center
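The rule Eric describes (compute tokens for each data center as if it were its own cluster) might be sketched as follows. This assumes RandomPartitioner's 0 to 2**127 token space; the function name `tokens_for_dc` and the small per-DC `offset`, used only to keep tokens unique across DCs as in the wiki's 0 / 1 example, are my own illustration rather than anything Cassandra provides.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumption)

def tokens_for_dc(node_count, offset=0):
    """Evenly spaced tokens for one DC, shifted by a small per-DC offset."""
    return [(i * RING_SIZE // node_count + offset) % RING_SIZE
            for i in range(node_count)]

# Two DCs with two nodes each; DC2 is shifted by 1 to avoid duplicate tokens.
dc1 = tokens_for_dc(2, offset=0)
dc2 = tokens_for_dc(2, offset=1)

print(dc1)
print(dc2)
```

For two nodes per DC this yields 0 and 85070591730234615865843651857942052864 for DC1, the same values as the wiki example quoted at the start of the thread, with DC2's tokens one higher.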

Re: Docs: Token Selection

2011-06-16 Thread aaron morton
See this thread for background http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Replica-data-distributing-between-racks-td6324819.html In a multi DC environment, if you calculate the initial tokens for the entire cluster data will not be evenly distributed. Cheers

Re: Docs: Token Selection

2011-06-15 Thread Vijay
+1 for more documentation (I guess contributions are always welcomed) I will try to write it down sometime when we have a bit more time... 0.8 nodetool ring command adds the DC and RAC information http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers http://www

Re: Docs: Token Selection

2011-06-15 Thread AJ
Ok. I understand the reasoning you laid out. But, I think it should be documented more thoroughly. I was trying to get an idea as to how flexible Cass lets you be with the various combinations of strategies, snitches, token ranges, etc.. It would be instructional to see what a graphical rep

Re: Docs: Token Selection

2011-06-15 Thread Vijay
No, it won't; it will assume you are doing the right thing... Regards, On Wed, Jun 15, 2011 at 2:34 PM, AJ wrote: > Vijay, thank you for your thoughtful reply. Will Cass complain if I don't > setup my tokens like in the examples? > > > On 6/15/2011 2:41 PM, Vijay wrote: > > All you heard

Re: Docs: Token Selection

2011-06-15 Thread AJ
Vijay, thank you for your thoughtful reply. Will Cass complain if I don't setup my tokens like in the examples? On 6/15/2011 2:41 PM, Vijay wrote: All you heard is right... You are not overriding Cassandra's token assignment by saying here is your token... Logic is: Calculate a token for th

Re: Docs: Token Selection

2011-06-15 Thread Vijay
All you heard is right... You are not overriding Cassandra's token assignment by saying here is your token... Logic is: Calculate a token for the given key... find the node in each region independently (If you use NTS and if you set the strategy options which says you want to replicate to the othe

Re: Docs: Token Selection

2011-06-15 Thread AJ
On 6/15/2011 12:14 PM, Vijay wrote: Correction "The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1" should be "The problem in the above approach is you have 1 node between 0-4 (25%) and one node covering the rest which is 4

Re: Docs: Token Selection

2011-06-15 Thread Vijay
Correction "The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1" should be "The problem in the above approach is you have 1 node between 0-4 (25%) and one node covering the rest which is 4-16, 0-0 (75%)" Regards, On Wed, Jun
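The 25% / 75% split in Vijay's correction can be reproduced with a short sketch. It assumes a toy ring of 16 tokens, that a node owns the range (previous token, own token] within its DC, and that DC1's two nodes sit at tokens 0 and 4 (inferred from the "0-4" in the correction); `ownership` is a hypothetical helper.

```python
def ownership(tokens, ring_size):
    """Map each token to the fraction of the ring it owns: (previous token, token]."""
    ordered = sorted(tokens)
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this wraps to the last token
        width = (tok - prev) % ring_size or ring_size  # a lone node owns everything
        shares[tok] = width / ring_size
    return shares

# Quartering the whole cluster leaves DC1 with tokens 0 and 4: uneven.
print(ownership([0, 4], 16))
# Splitting DC1 on its own gives tokens 0 and 8: even.
print(ownership([0, 8], 16))
```

The node at token 4 covers only (0, 4], a quarter of the ring, while the node at token 0 covers the remaining three quarters; spacing the DC's tokens evenly restores a 50/50 split.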

Re: Docs: Token Selection

2011-06-15 Thread Vijay
The problem in the above approach is you have 2 nodes between 12 to 4 in DC1 but from 4 to 12 you just have 1 (which will cause uneven distribution of data across the nodes). It is easier to think of each DC as its own ring, split it equally, and interleave them together DC1 Node 1 : token 0 DC1 Node 2 : t

Re: Docs: Token Selection

2011-06-14 Thread AJ
Yes, which means that the ranges overlap each other. Is this just a convention, or is it technically required when using NetworkTopologyStrategy? Would it be acceptable to split the ranges into quarters by ignoring the data centers, such as: DC1 node 1 = 0 Range: (12, 16], (0, 0] node 2

Re: Docs: Token Selection

2011-06-14 Thread Vijay
Yes... That's right... If you are trying to say the below... DC1 Node1 Owns 50% (Ranges 8..4 -> 8..5 & 8..5 -> 0) Node2 Owns 50% (Ranges 0 -> 1 & 1 -> 8..4) DC2 Node1 Owns 50% (Ranges 8..5 -> 0 & 0 -> 1) Node2 Owns 50% (Ranges 1 -> 8..4 & 8..4 -> 8..5) Regards, On Tue, Jun 14, 2011 a

Docs: Token Selection

2011-06-14 Thread AJ
This http://wiki.apache.org/cassandra/Operations#Token_selection says: "With NetworkTopologyStrategy, you should calculate the tokens the nodes in each DC independantly." and gives the example: DC1 node 1 = 0 node 2 = 85070591730234615865843651857942052864 DC2 node 3 = 1 node 4 = 8507059173