Re: Multi-DC Repairs and Token Questions

Nick Bailey Mon, 02 Jun 2014 23:07:25 -0700

See https://issues.apache.org/jira/browse/CASSANDRA-7317



On Mon, Jun 2, 2014 at 8:57 PM, Matthew Allen <matthew.j.al...@gmail.com>
wrote:

> Hi Rameez, Chovatia, (sorry I initially replied to Dwight individually)
>
> SN_KEYSPACE and MY_KEYSPACE are just typos (I was trying to mask out
> identifiable information); they are the same keyspace.
>
> Keyspace: SN_KEYSPACE:
>   Replication Strategy:
> org.apache.cassandra.locator.NetworkTopologyStrategy
>   Durable Writes: true
>     Options: [DC_VIC:2, DC_NSW:2]
>
> In a nutshell, replication is working as expected. I'm just confused about
> token range assignments in a multi-DC environment and how repairs should
> work.
>
> From
> http://www.datastax.com/documentation/cassandra/1.2/cassandra/configuration/configGenTokens_c.html,
> it specifies
>
> *        "Multiple data center deployments: calculate the tokens for each
> data center so that the hash range is evenly divided for the nodes in each
> data center"*
>
> Given that nodetool repair -pr isn't multi-DC aware, in our production 18
> node cluster (9 nodes in each DC), which of the following token ranges
> should be used (Murmur3Partitioner)?
>
> Token range divided evenly over the 2 DCs/18 nodes, as below?
>
> Node DC_NSW                    DC_VIC
> 1    '-9223372036854775808'    '-8198552921648689608'
> 2    '-7173733806442603408'    '-6148914691236517208'
> 3    '-5124095576030431008'    '-4099276460824344808'
> 4    '-3074457345618258608'    '-2049638230412172408'
> 5    '-1024819115206086208'    '-8'
> 6    '1024819115206086192'     '2049638230412172392'
> 7    '3074457345618258592'     '4099276460824344792'
> 8    '5124095576030430992'     '6148914691236517192'
> 9    '7173733806442603392'     '8198552921648689592'
>
> Or an offset used for DC_VIC (i.e. DC_NSW + 100)?
>
> Node     DC_NSW                 DC_VIC
> 1     '-9223372036854775808'    '-9223372036854775708'
> 2     '-7173733806442603407'    '-7173733806442603307'
> 3     '-5124095576030431006'    '-5124095576030430906'
> 4     '-3074457345618258605'    '-3074457345618258505'
> 5     '-1024819115206086204'    '-1024819115206086104'
> 6     '1024819115206086197'     '1024819115206086297'
> 7     '3074457345618258598'     '3074457345618258698'
> 8     '5124095576030430999'     '5124095576030431099'
> 9     '7173733806442603400'     '7173733806442603500'
>
> It's too late for me to switch to vnodes. Hope that makes sense, thanks.
>
> Matt
>
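For reference, both candidate layouts in the message above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the Murmur3Partitioner token range of -2^63 to 2^63-1 and integer (floor) division for the step, which matches the numbers in the two tables. The function and variable names are made up for illustration.

```python
# Sketch: reproduce the two candidate token layouts for Murmur3Partitioner.
# Names here are illustrative, not part of any Cassandra tool.
MIN_TOKEN = -2**63   # smallest Murmur3 token
RING_SIZE = 2**64    # total number of tokens in the ring


def even_tokens(node_count, offset=0):
    """Return node_count evenly spaced tokens from MIN_TOKEN, shifted by offset."""
    step = RING_SIZE // node_count  # floor division, as in the tables above
    return [MIN_TOKEN + i * step + offset for i in range(node_count)]


# Option 1: divide the ring over all 18 nodes, alternating between the DCs.
all_nodes = even_tokens(18)
nsw_interleaved = all_nodes[0::2]
vic_interleaved = all_nodes[1::2]

# Option 2: divide the ring over the 9 nodes of one DC, offset the other by 100.
nsw_offset = even_tokens(9)
vic_offset = even_tokens(9, offset=100)

if __name__ == "__main__":
    for node, (nsw, vic) in enumerate(zip(nsw_offset, vic_offset), start=1):
        print(node, nsw, vic)
```

Option 1 spaces every node equally around the whole ring, while option 2 places each DC_VIC token 100 positions after its DC_NSW counterpart.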
>
>
> On Thu, May 29, 2014 at 12:01 AM, Rameez Thonnakkal <ssram...@gmail.com>
> wrote:
>
>> As Chovatia mentioned, the keyspaces seem to be different.
>> Try "describe keyspace SN_KEYSPACE" and "describe keyspace MY_KEYSPACE"
>> from CQL.
>> This will give you an idea of how many replicas there are for these
>> keyspaces.
>>
>>
>>
>> On Wed, May 28, 2014 at 11:49 AM, chovatia jaydeep <
>> chovatia_jayd...@yahoo.co.in> wrote:
>>
>>> What is your partitioner type? Is
>>> it org.apache.cassandra.dht.Murmur3Partitioner?
>>> In your repair command I see two different keyspaces,
>>> "MY_KEYSPACE" and "SN_KEYSPACE". Are these two separate keyspaces or a
>>> typo?
>>>
>>> -jaydeep
>>>
>>>
>>>   On Tuesday, 27 May 2014 10:26 PM, Matthew Allen <
>>> matthew.j.al...@gmail.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> I am a bit confused regarding data ownership in a multi-DC environment.
>>>
>>> I have the following setup in a test cluster with a keyspace with
>>> (placement_strategy = 'NetworkTopologyStrategy' and strategy_options =
>>> {'DC_NSW':2,'DC_VIC':2};)
>>>
>>> Datacenter: DC_NSW
>>> ==========
>>> Replicas: 2
>>> Address  Rack   Status  State   Load        Owns     Token
>>>                                                      0
>>> nsw1     rack1  Up      Normal  1007.43 MB  100.00%  -9223372036854775808
>>> nsw2     rack1  Up      Normal  1008.08 MB  100.00%  0
>>>
>>>
>>> Datacenter: DC_VIC
>>> ==========
>>> Replicas: 2
>>> Address  Rack   Status  State   Load        Owns     Token
>>>                                                      100
>>> vic1     rack1  Up      Normal  1015.1 MB   100.00%  -9223372036854775708
>>> vic2     rack1  Up      Normal  1015.13 MB  100.00%  100
>>>
>>> My understanding is that both Datacenters have a complete copy of the
>>> data, but when I run a repair -pr on each of the nodes, the vic hosts only
>>> take a couple of seconds, while the nsw nodes take about 5 minutes each.
>>>
>>> Does this mean that nsw nodes "own" the majority of the data given their
>>> key ranges and that repairs will need to cross datacenters ?
>>>
>>> Thanks
>>>
>>> Matt
>>>
>>> command>nodetool -h vic1 repair -pr   (takes seconds)
>>> Starting NodeTool
>>> [2014-05-28 15:11:02,783] Starting repair command #1, repairing 1 ranges
>>> for keyspace MY_KEYSPACE
>>> [2014-05-28 15:11:03,110] Repair session
>>> 76d170f0-e626-11e3-af4e-218541ad23a1 for range
>>> (-9223372036854775808,-9223372036854775708] finished
>>> [2014-05-28 15:11:03,110] Repair command #1 finished
>>> [2014-05-28 15:11:03,126] Nothing to repair for keyspace 'system'
>>> [2014-05-28 15:11:03,126] Nothing to repair for keyspace 'system_traces'
>>>
>>> command>nodetool -h vic2 repair -pr (takes seconds)
>>> Starting NodeTool
>>> [2014-05-28 15:11:28,746] Starting repair command #1, repairing 1 ranges
>>> for keyspace MY_KEYSPACE
>>> [2014-05-28 15:11:28,840] Repair session
>>> 864b14a0-e626-11e3-9612-07b0c029e3c7 for range (0,100] finished
>>> [2014-05-28 15:11:28,840] Repair command #1 finished
>>> [2014-05-28 15:11:28,866] Nothing to repair for keyspace 'system'
>>> [2014-05-28 15:11:28,866] Nothing to repair for keyspace 'system_traces'
>>>
>>> command>nodetool -h nsw1 repair -pr (takes minutes)
>>> Starting NodeTool
>>> [2014-05-28 15:11:32,579] Starting repair command #1, repairing 1 ranges
>>> for keyspace SN_KEYSPACE
>>> [2014-05-28 15:14:07,187] Repair session
>>> 88966430-e626-11e3-81eb-c991646ac2bf for range (100,-9223372036854775808]
>>> finished
>>> [2014-05-28 15:14:07,187] Repair command #1 finished
>>> [2014-05-28 15:14:11,393] Nothing to repair for keyspace 'system'
>>> [2014-05-28 15:14:11,440] Nothing to repair for keyspace 'system_traces'
>>>
>>> command>nodetool -h nsw2 repair -pr (takes minutes)
>>> Starting NodeTool
>>> [2014-05-28 15:14:18,670] Starting repair command #1, repairing 1 ranges
>>> for keyspace SN_KEYSPACE
>>> [2014-05-28 15:17:27,300] Repair session
>>> eb936ce0-e626-11e3-81e2-8790242f886e for range (-9223372036854775708,0]
>>> finished
>>> [2014-05-28 15:17:27,300] Repair command #1 finished
>>> [2014-05-28 15:17:32,017] Nothing to repair for keyspace 'system'
>>> [2014-05-28 15:17:32,064] Nothing to repair for keyspace 'system_traces'
>>>
>>>
>>>
>>
>
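The ranges printed in the repair logs above are consistent with each node's primary range being taken over the whole ring, ignoring datacenters: the span from the previous token on the ring (exclusive) up to the node's own token (inclusive). The sketch below illustrates that reading using the four tokens from the test cluster. It is not Cassandra's code, and the helper names are hypothetical.

```python
# Sketch: ring-wide primary ranges for the four-node test cluster above.
# Illustrative only; this is not how Cassandra itself is implemented.
RING_SIZE = 2**64

tokens = {
    "nsw1": -9223372036854775808,
    "vic1": -9223372036854775708,
    "nsw2": 0,
    "vic2": 100,
}


def primary_ranges(tokens_by_node):
    """Map each node to (start, end]: previous ring token up to its own token."""
    ring = sorted(tokens_by_node.items(), key=lambda item: item[1])
    ranges = {}
    for i, (node, token) in enumerate(ring):
        prev_token = ring[i - 1][1]  # index -1 wraps around to the last token
        ranges[node] = (prev_token, token)
    return ranges


def range_width(start, end):
    """Number of tokens in (start, end], allowing for wrap-around."""
    return (end - start) % RING_SIZE


if __name__ == "__main__":
    for node, (start, end) in primary_ranges(tokens).items():
        print(node, (start, end), range_width(start, end))
```

Under this reading, vic1 and vic2 each have a primary range of only 100 tokens, while nsw1 and nsw2 split almost the entire ring between them. That would explain why repair -pr finishes in seconds on the vic hosts and takes minutes on the nsw hosts, even though both datacenters hold a full copy of the data.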
