Thanks, Jon! I just added the AZ for each rack in the right-hand column.
Thanks for your reply and the clarification. Maybe I should have named the
racks RACK-READ and RACK-WRITE instead of ONE and TWO to avoid confusion.

One more question: which is more fault tolerant with RF=3: A) spreading each
DC across 3 AZs, or B) assigning each DC a separate AZ? I assume I should
adjust the consistency level accordingly in case of failures: if I have 3
nodes with RF=3 and LOCAL_QUORUM consistency and 1 node goes down, I should
downgrade to LOCAL_ONE if I want to keep serving read traffic.
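For example, on the client side I imagine something like this (an untested
sketch using the DataStax Python driver; the contact point, keyspace, and
table names are made up):

    from cassandra import ConsistencyLevel, Unavailable
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(['10.1.20.49'])         # any contact point in the local DC
    session = cluster.connect('my_keyspace')  # illustrative keyspace name

    # Try the read at LOCAL_QUORUM first.
    stmt = SimpleStatement("SELECT * FROM my_table WHERE id = %s",
                           consistency_level=ConsistencyLevel.LOCAL_QUORUM)
    try:
        rows = session.execute(stmt, ["some-id"])
    except Unavailable:
        # Not enough live replicas for quorum: retry at LOCAL_ONE so we
        # keep serving reads during the outage.
        stmt.consistency_level = ConsistencyLevel.LOCAL_ONE
        rows = session.execute(stmt, ["some-id"])

(I believe the driver also ships a DowngradingConsistencyRetryPolicy that
does roughly this automatically.)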
Best,
Sergio

On Wed, Oct 23, 2019 at 2:12 PM Jon Haddad <j...@jonhaddad.com> wrote:

> Oh, my bad. There was a flood of information there; I didn't realize you
> had switched to two DCs. It's been a long day.
>
> I'll be honest: it's really hard to read your various options, because
> you've intermixed terminology from AWS and Cassandra in a weird way and
> there are several pages of information here to go through. I don't have
> time to decipher it, sorry.
>
> Spread a DC across 3 AZs if you want to be fault tolerant and will use
> RF=3; use a single AZ if you don't care about full DC failure when an AZ
> fails, or you're not using RF=3.
>
> On Wed, Oct 23, 2019 at 4:56 PM Sergio <lapostadiser...@gmail.com> wrote:
>
>> OPTION C or OPTION A?
>>
>> Which one are you referring to?
>>
>> Both have separate DCs to keep the workloads separate.
>>
>> OPTION A)
>>
>> Node  DC     RACK  AZ
>> 1     read   ONE   us-east-1a
>> 2     read   ONE   us-east-1a
>> 3     read   ONE   us-east-1a
>> 4     write  TWO   us-east-1b
>> 5     write  TWO   us-east-1b
>> 6     write  TWO   us-east-1b
>>
>> Here we have 2 DCs (read and write),
>> one rack per DC, and
>> one availability zone per DC.
>>
>> Thanks,
>>
>> Sergio
>>
>> On Wed, Oct 23, 2019, 1:11 PM Jon Haddad <j...@jonhaddad.com> wrote:
>>
>>> Personally, I wouldn't ever do this. I recommend separate DCs if you
>>> want to keep workloads separate.
>>>
>>> On Wed, Oct 23, 2019 at 4:06 PM Sergio <lapostadiser...@gmail.com>
>>> wrote:
>>>
>>>> I forgot to comment on OPTION C)
>>>>
>>>> Node  DC     RACK  AZ
>>>> 1     read   ONE   us-east-1a
>>>> 2     read   ONE   us-east-1b
>>>> 3     read   ONE   us-east-1c
>>>> 4     write  TWO   us-east-1a
>>>> 5     write  TWO   us-east-1b
>>>> 6     write  TWO   us-east-1c
>>>>
>>>> Here I would expect to have to decrease the consistency level for
>>>> reads if one of the AZs goes down. Also, please consider the one below
>>>> as the real OPTION A; the previous one looks wrong because the same
>>>> rack name was assigned to 2 different DCs.
>>>>
>>>> OPTION A)
>>>>
>>>> Node  DC     RACK  AZ
>>>> 1     read   ONE   us-east-1a
>>>> 2     read   ONE   us-east-1a
>>>> 3     read   ONE   us-east-1a
>>>> 4     write  TWO   us-east-1b
>>>> 5     write  TWO   us-east-1b
>>>> 6     write  TWO   us-east-1b
>>>>
>>>> Thanks,
>>>>
>>>> Sergio
>>>>
>>>> On Wed, Oct 23, 2019 at 12:33 PM Sergio <lapostadiser...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Reid,
>>>>>
>>>>> Thank you very much for clearing up these concepts for me. I posted
>>>>> a question about our unbalanced cluster on the DataStax forum
>>>>> (https://community.datastax.com/comments/1133/view.html), and the
>>>>> reply was that *the number of racks should be a multiple of the
>>>>> replication factor* (or 1) for the cluster to be balanced
>>>>> (presumably because NetworkTopologyStrategy spreads replicas across
>>>>> distinct racks, so with RF=3 and only 2 racks one rack ends up
>>>>> holding 2 of the 3 copies of each range, which skews ownership). So
>>>>> I thought: if I have 3 availability zones, I should have 3 racks for
>>>>> each datacenter, not the 2 (us-east-1b, us-east-1a) I have right
>>>>> now; or, in the simplest setup, one rack per datacenter.
>>>>>
>>>>> Datacenter: live
>>>>> ================
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address      Load        Tokens  Owns  Host ID                               Rack
>>>>> UN  10.1.20.49   289.75 GiB  256     ?     be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
>>>>> UN  10.1.30.112  103.03 GiB  256     ?     e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
>>>>> UN  10.1.19.163  129.61 GiB  256     ?     3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
>>>>> UN  10.1.26.181  145.28 GiB  256     ?     0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
>>>>> UN  10.1.17.213  149.04 GiB  256     ?     71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
>>>>> DN  10.1.19.198  52.41 GiB   256     ?     613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
>>>>> UN  10.1.31.60   195.17 GiB  256     ?     3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
>>>>> UN  10.1.25.206  100.67 GiB  256     ?     f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
>>>>>
>>>>> Each rack label right now matches the availability zone; we have 3
>>>>> datacenters and 2 availability zones with 2 racks per DC, and the
>>>>> above is clearly unbalanced.
>>>>>
>>>>> If I have a keyspace with replication factor = 3 and I want to
>>>>> minimize the number of nodes needed to scale the cluster up and down
>>>>> while keeping it balanced, should I consider an approach like
>>>>> OPTION A?
>>>>>
>>>>> OPTION A)
>>>>>
>>>>> Node  DC     RACK  AZ
>>>>> 1     read   ONE   us-east-1a
>>>>> 2     read   ONE   us-east-1a
>>>>> 3     read   ONE   us-east-1a
>>>>> 4     write  ONE   us-east-1b
>>>>> 5     write  ONE   us-east-1b
>>>>> 6     write  ONE   us-east-1b
>>>>>
>>>>> OPTION B)
>>>>>
>>>>> Node  DC     RACK  AZ
>>>>> 1     read   ONE   us-east-1a
>>>>> 2     read   ONE   us-east-1a
>>>>> 3     read   ONE   us-east-1a
>>>>> 4     write  TWO   us-east-1b
>>>>> 5     write  TWO   us-east-1b
>>>>> 6     write  TWO   us-east-1b
>>>>> 7     read   ONE   us-east-1c
>>>>> 8     write  TWO   us-east-1c
>>>>> 9     read   ONE   us-east-1c
>>>>>
>>>>> Option B looks unbalanced, so I would exclude it.
>>>>>
>>>>> OPTION C)
>>>>>
>>>>> Node  DC     RACK  AZ
>>>>> 1     read   ONE   us-east-1a
>>>>> 2     read   ONE   us-east-1b
>>>>> 3     read   ONE   us-east-1c
>>>>> 4     write  TWO   us-east-1a
>>>>> 5     write  TWO   us-east-1b
>>>>> 6     write  TWO   us-east-1c
>>>>>
>>>>> So I am thinking of A if I have the restriction of 2 AZs, but I
>>>>> guess option C would be the best. If I have to add another DC for
>>>>> reads, because we want to assign a new DC to each new microservice,
>>>>> it would look like:
>>>>>
>>>>> OPTION EXTRA DC FOR READS)
>>>>>
>>>>> Node  DC          RACK   AZ
>>>>> 1     read        ONE    us-east-1a
>>>>> 2     read        ONE    us-east-1b
>>>>> 3     read        ONE    us-east-1c
>>>>> 4     write       TWO    us-east-1a
>>>>> 5     write       TWO    us-east-1b
>>>>> 6     write       TWO    us-east-1c
>>>>> 7     extra-read  THREE  us-east-1a
>>>>> 8     extra-read  THREE  us-east-1b
>>>>> 9     extra-read  THREE  us-east-1c
>>>>>
>>>>> The *write* DC will replicate the data to the other datacenters. My
>>>>> goal is to keep the *read* machines dedicated to serving reads and
>>>>> the *write* machines to serving writes; Cassandra will handle the
>>>>> replication for me. Is there any other option I am missing, or any
>>>>> wrong assumption? I am thinking of writing a blog post about all my
>>>>> learnings so far. Thank you very much for the replies.
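>>>>> For what it's worth, I imagine the keyspace for OPTION C would be
>>>>> defined roughly like this (an untested sketch with the DataStax
>>>>> Python driver; the keyspace name is made up, and the per-DC factors
>>>>> assume RF=3 in both DCs):
>>>>>
>>>>>     from cassandra.cluster import Cluster
>>>>>
>>>>>     cluster = Cluster(['10.1.20.49'])   # illustrative contact point
>>>>>     session = cluster.connect()
>>>>>
>>>>>     # NetworkTopologyStrategy counts the replication factor per DC:
>>>>>     # 3 copies in DC 'read' and 3 copies in DC 'write'.
>>>>>     session.execute("""
>>>>>         CREATE KEYSPACE my_ks WITH replication = {
>>>>>             'class': 'NetworkTopologyStrategy',
>>>>>             'read': 3,
>>>>>             'write': 3
>>>>>         }
>>>>>     """)
>>>>>
>>>>> Writes arriving in either DC would be replicated to both, so the
>>>>> isolation is about which DC the clients point their traffic at, not
>>>>> about where the data lives.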
>>>>> Best,
>>>>> Sergio
>>>>>
>>>>> On Wed, Oct 23, 2019 at 10:57 AM Reid Pinchback <
>>>>> rpinchb...@tripadvisor.com> wrote:
>>>>>
>>>>>> No, that's not correct. The point of racks is to help you distribute
>>>>>> the replicas, not further-replicate the replicas. Data centers are
>>>>>> what do the latter. So, for example, if you wanted to ensure that
>>>>>> you always had quorum when an AZ went down, you could have two DCs,
>>>>>> one in each AZ, with one rack in each DC. In your situation I think
>>>>>> I'd be more tempted to consider that. Then if an AZ went away, you
>>>>>> could fail over your traffic to the remaining DC and still be
>>>>>> perfectly fine.
>>>>>>
>>>>>> For background on replicas vs. racks, I believe the information you
>>>>>> want is under the heading 'NetworkTopologyStrategy' at:
>>>>>>
>>>>>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>>>>>
>>>>>> That should help you better understand how replicas distribute.
>>>>>>
>>>>>> As mentioned before, while you can choose to do the reads in one DC,
>>>>>> except for concerns about contention related to network traffic and
>>>>>> connection handling, you can't isolate reads from writes. You can
>>>>>> *mostly* insulate the write DC from the activity within the read DC,
>>>>>> and even that isn't absolute, because of repairs. However, your
>>>>>> mileage may vary, so do what makes sense for your usage pattern.
>>>>>>
>>>>>> R
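>>>>>> (To make the replica arithmetic concrete, assuming
>>>>>> NetworkTopologyStrategy with the factor counted per DC: 100 GB of
>>>>>> raw data at RF=3 occupies roughly 100 GB x 3 = 300 GB in each DC
>>>>>> carrying that factor, whether that DC uses 1 rack or 3. Racks decide
>>>>>> which nodes hold the three copies; they do not add copies.)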
>>>>>> From: Sergio <lapostadiser...@gmail.com>
>>>>>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>> Date: Wednesday, October 23, 2019 at 12:50 PM
>>>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>> Subject: Re: Cassandra Rack - Datacenter Load Balancing relations
>>>>>>
>>>>>> Hi Reid,
>>>>>>
>>>>>> Thanks for your reply. I really appreciate your explanation.
>>>>>>
>>>>>> We are in AWS, and we are using 2 availability zones right now, not
>>>>>> 3. We found our cluster really unbalanced because the keyspace has a
>>>>>> replication factor = 3 and the number of racks is 2, with 2
>>>>>> datacenters. We want the writes spread across all the nodes, but we
>>>>>> want the reads isolated from the writes, to keep the load on those
>>>>>> nodes low and to be able to tell problems in the consumer (read)
>>>>>> applications apart from problems in the producer (write)
>>>>>> applications.
>>>>>> It looks like each rack contains an entire copy of the data, so the
>>>>>> information would be replicated once per rack and then per node. If
>>>>>> I am correct, a keyspace with 100GB, replication factor = 3, and
>>>>>> racks = 3 gives 100 * 3 * 3 = 900GB. If I had only one rack across 2
>>>>>> or even 3 availability zones, I would save space and have only
>>>>>> 300GB. Please correct me if I am wrong.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Sergio
>>>>>>
>>>>>> On Wed, Oct 23, 2019 at 9:21 AM Reid Pinchback <
>>>>>> rpinchb...@tripadvisor.com> wrote:
>>>>>>
>>>>>> Datacenters and racks are different concepts. While they don't have
>>>>>> to keep their historical meanings, the historical meanings probably
>>>>>> provide a helpful model for understanding what you want from them.
>>>>>>
>>>>>> When companies own their own physical servers and have them housed
>>>>>> somewhere, the question arises of where to locate any particular
>>>>>> server. It's a balancing act between things like the network speed
>>>>>> of related servers talking to each other and the fault tolerance of
>>>>>> not having many servers all exposed to the same risks.
>>>>>>
>>>>>> "Same rack" in that physical world tended to mean something like
>>>>>> "all behind the same network switch and all sharing the same power
>>>>>> bus". The morning after an electrical glitch fries a power bus, and
>>>>>> thus everything in that rack, you realize you wish you didn't have
>>>>>> so many of the same type of server together. Well, they were
>>>>>> servers. Now they are door stops. Badness and sadness.
>>>>>>
>>>>>> That's the mindset to have with racks in Cassandra. A rack is an
>>>>>> artifact for separating servers into pools so that the disparate
>>>>>> pools have, hopefully, somewhat independent infrastructure risks.
>>>>>> However, all those servers are still doing the same kind of work,
>>>>>> are the same version, etc.
>>>>>>
>>>>>> Datacenters are amalgams of those racks, and how similar or
>>>>>> different they are from each other depends on what you want to do
>>>>>> with them. What is true is that if you have N datacenters, each one
>>>>>> of them must have enough disk storage to house all the data. The
>>>>>> actual physical footprint of that data in each DC depends on the
>>>>>> replication factors in play.
>>>>>>
>>>>>> Note that you sort of can't have "one datacenter for writes",
>>>>>> because the writes will replicate across the data centers. You could
>>>>>> definitely choose to have only one DC that takes read queries, but
>>>>>> it's best to think of writing as universal. One scenario you can
>>>>>> have is where the DC not taking live read traffic is the one you use
>>>>>> for maintenance, performance testing, or version upgrades.
>>>>>>
>>>>>> One rack makes your life easier if you don't have a reason for
>>>>>> multiple racks. It depends on the environment you deploy into and
>>>>>> your fault tolerance goals. If you were in AWS and wanted to spread
>>>>>> risk across availability zones, then you would likely have as many
>>>>>> racks as the AZs you choose to be in, because that's really the
>>>>>> point of using multiple AZs.
>>>>>>
>>>>>> R
>>>>>>
>>>>>> On 10/23/19, 4:06 AM, "Sergio Bilello" <lapostadiser...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>     Hello guys!
>>>>>>
>>>>>>     I was reading
>>>>>>     https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
>>>>>>     and I would like to understand a concept related to node load
>>>>>>     balancing.
>>>>>>
>>>>>>     I know that Jon recommends vnodes = 4, but right now I found a
>>>>>>     cluster with vnodes = 256, replication factor = 3, and 2 racks.
>>>>>>     This is unbalanced because the number of racks is not a multiple
>>>>>>     of the replication factor.
>>>>>>
>>>>>>     My plan, however, is to move all the nodes into a single rack so
>>>>>>     that I can eventually scale the cluster up and down one node at
>>>>>>     a time.
>>>>>>
>>>>>>     If I had 3 racks and wanted to keep things balanced, I would
>>>>>>     have to scale up 3 nodes at a time, one per rack.
>>>>>>
>>>>>>     If I had 3 racks, should I also have 3 different datacenters,
>>>>>>     one datacenter per rack?
>>>>>>
>>>>>>     Can I have 2 datacenters and 3 racks? If that is possible, would
>>>>>>     one datacenter have more nodes than the other, and could that be
>>>>>>     a problem?
>>>>>>
>>>>>>     I am thinking of splitting my cluster into one datacenter for
>>>>>>     reads and one for writes, and keeping all the nodes in the same
>>>>>>     rack so I can scale one node at a time.
>>>>>>
>>>>>>     Please correct me if I am wrong.
>>>>>>
>>>>>>     Thanks,
>>>>>>
>>>>>>     Sergio
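>>>>>>     (For what it's worth, a minimal sketch of how each node would
>>>>>>     advertise its DC and rack under GossipingPropertyFileSnitch; the
>>>>>>     dc name and the AZ-as-rack value are illustrative, e.g. for a
>>>>>>     node of a "read" DC in us-east-1a:
>>>>>>
>>>>>>         # conf/cassandra-rackdc.properties (one per node)
>>>>>>         dc=read
>>>>>>         rack=us-east-1a
>>>>>>
>>>>>>         # conf/cassandra.yaml (relevant lines)
>>>>>>         endpoint_snitch: GossipingPropertyFileSnitch
>>>>>>         num_tokens: 4    # Jon's recommendation; this cluster is at 256
>>>>>>
>>>>>>     On AWS, Ec2Snitch instead derives the rack from the AZ
>>>>>>     automatically.)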
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>>>>> For additional commands, e-mail: user-h...@cassandra.apache.org