Re: Cassandra rack awareness

Edson Marquezani Filho Sat, 28 Feb 2015 05:34:10 -0800

As far as I know, the main effect of using NetworkTopologyStrategy
with different racks is on replica placement throughout your cluster.
That strategy prefers distinct racks when choosing where a row's
replicas will be placed. So, if you have different numbers of nodes
in each rack, you will probably end up with an unbalanced cluster
(in terms of data load), not because of how the rows are partitioned,
but because of where the replicas go. The effect also depends on
your replication factor. (You can sit down and do the math
yourself.)

I had an issue like that some time ago. I was not aware of that
behavior, didn't really care about where my machines were, and was
using SimpleStrategy. When I decided to switch to
NetworkTopologyStrategy, I realized I had a bad physical layout
(4 nodes in one rack, 1 node in another), so I had to fake that
last node's rack, as if it were in the same rack as the other nodes.
Otherwise that node would have been alone in its rack with twice the
amount of data the other ones had. (As I said, that could be even
worse if I had a higher replication factor.)
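
For anyone who wants to "do the math" for a layout like mine, here is
a rough Python sketch. It is a simplified model of rack-aware
placement in a single data centre, not Cassandra's actual code: the
function names, the node names (n1..n5), the rack names and the
shuffled equal-size token ranges are all assumptions made for
illustration.

```python
import random
from collections import Counter


def build_ring(racks, vnodes=64, seed=1):
    """Return a shuffled list of (node, rack) entries, one per vnode.

    Assumption: every ring position owns an equally sized token range.
    """
    ring = [(node, rack)
            for rack, nodes in racks.items()
            for node in nodes
            for _ in range(vnodes)]
    random.Random(seed).shuffle(ring)
    return ring


def replicas_for(ring, start, rf, rack_count):
    """Walk the ring from `start`, preferring nodes in racks not used yet."""
    chosen, seen_racks, skipped = [], set(), []
    for step in range(len(ring)):
        if len(chosen) == rf:
            break
        node, rack = ring[(start + step) % len(ring)]
        if node in chosen or node in skipped:
            continue  # another vnode of a node we already looked at
        if rack in seen_racks:
            if len(seen_racks) == rack_count:
                chosen.append(node)   # every rack is covered, take any node
            else:
                skipped.append(node)  # hold back until every rack is covered
            continue
        chosen.append(node)
        seen_racks.add(rack)
        if len(seen_racks) == rack_count:
            # Every rack now holds a replica: fall back to held-back nodes.
            for held in skipped:
                if len(chosen) == rf:
                    break
                chosen.append(held)
            skipped.clear()
    return chosen


def replica_share(racks, rf, vnodes=64, seed=1):
    """Fraction of all token ranges for which each node holds a replica."""
    ring = build_ring(racks, vnodes, seed)
    counts = Counter()
    for start in range(len(ring)):
        counts.update(replicas_for(ring, start, rf, len(racks)))
    return {node: counts[node] / len(ring)
            for nodes in racks.values() for node in nodes}


if __name__ == "__main__":
    layout = {"RAC1": ["n1", "n2", "n3", "n4"], "RAC2": ["n5"]}
    for rf in (2, 3):
        print(rf, replica_share(layout, rf))
```

In this model, with two racks and a replication factor of 2 or more,
the lone node in RAC2 ends up holding a replica of every range. With a
replication factor of 3 the other two replicas of each range are
spread over the four RAC1 nodes, which averages out to half of the
data each, so the lone node carries about twice as much as the others.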

To be honest, I'm not sure I fully understand the documentation you
quoted in your first email, especially the last sentence. But my
(limited) experience with Cassandra (2.1) tells me that if you start
off with a balanced rack setup, you'll be fine. Otherwise, you'll have
to change your nodes' physical location or fake it in the config
file, and run repair and cleanup on your entire cluster (which is a
pain in the ass) to get a balanced cluster again. I had to do that. =P

On Sat, Feb 28, 2015 at 6:05 AM, Amlan Roy <amlan....@cleartrip.com> wrote:
> Hi Rob,
>
> Thanks for sharing the link. I have gone through it and few other documents
> as well. Still I am confused. It seems, if we use vnodes and
> NetworkTopologyStrategy, we should use a single rack configuration in
> Cassandra. Or, it can create hotspots in the ring. Not sure if my
> understanding is correct.
>
> Regards.
>
>
> On 28-Feb-2015, at 2:42 am, Robert Coli <rc...@eventbrite.com> wrote:
>
> On Fri, Feb 27, 2015 at 7:30 AM, Amlan Roy <amlan....@cleartrip.com> wrote:
>>
>> I am new to Cassandra and trying to setup a Cassandra 2.0 cluster using 4
>> nodes, 2 each in 2 different racks. All are in same data centre. This is
>> what I see in the documentation:
>>
>> To use racks correctly:
>>
>> Use the same number of nodes in each rack. Use one rack and place the
>> nodes in different racks in an alternating pattern. This allows you to still
>> get the benefits of Cassandra's rack feature, and allows for quick and fully
>> functional expansions. Once the cluster is stable, you can swap nodes and
>> make the appropriate moves to ensure that nodes are placed in the ring in an
>> alternating fashion with respect to the racks.
>>
>> What I have understood is, in cassandra-rackdc.properties, I need to use
>> single rack name even though I have 2 racks and then place the nodes in such
>> an order that they are placed in an alternating fashion - RAC1-NODE1,
>> RAC2-NODE1, RAC1-NODE2, RAC2-NODE2.
>>
>> Just wanted to know if this is correct. If yes, how do I enforce this
>> order while adding nodes.
>
> https://issues.apache.org/jira/browse/CASSANDRA-3810
>
> =Rob
>
>
>
