Hi all,

Sorry I couldn't update earlier, as I got caught up in some other stuff.
Anyway, my previous 3-node cluster was on version 3.9. I created a new cluster of Cassandra 3.11.2 with the same number of nodes, on GCE VMs instead of DC/OS. My existing cluster has its Cassandra data on persistent disks, so I made copies of those disks and attached them to the new cluster. I was following this link to move the data to the new cluster:
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsSnapshotRestoreNewCluster.html

As mentioned in the link, I manually assigned token ranges to each node according to its corresponding node in the previous cluster. When I restarted the Cassandra process on the VMs, I noticed that it had automatically picked up all my keyspaces and column families. I did not recreate the schema, copy data manually, or run sstableloader, and I am not sure whether this should have happened.

In any case, the data in the two clusters is still not in sync. I ran a simple count query on a table in both clusters and got different results:

Old cluster: 217699
New cluster: 138770

On the new cluster, when I run nodetool repair for my keyspace, it runs fine on one node, but the other two nodes say the keyspace's replication factor is 1, so repair is not needed. Cqlsh, however, shows that the replication factor is 2.

Nodetool status also shows different output for the two clusters:

*Cluster1:*

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns  Host ID                               Rack
UN  10.128.1.1  228.14 GiB  256     ?     63ff8054-934a-4a7a-a33f-405e064bc8e8  rack1
UN  10.128.1.2  231.25 GiB  256     ?     702e8a31-6441-4444-b569-d2d137d54a5d  rack1
UN  10.128.1.3  199.91 GiB  256     ?     b5b22a90-f037-433a-8ad9-f370b26cca26  rack1

*Cluster2:*

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns  Host ID                               Rack
UJ  10.142.0.4  211.27 GiB  256     ?     c55fef77-9c78-449c-b0d9-64e755caee7d  rack1
UN  10.142.0.2  228.14 GiB  256     ?     0065c8e1-47be-4cf8-a3fe-3f4d20ff1b47  rack1
UJ  10.142.0.3  241.77 GiB  256     ?     f3b3f409-d108-4751-93ba-682692e46318  rack1

This is weird because both clusters have essentially the same disks attached to them. Only one node in cluster2 (10.142.0.2) has the same load as its counterpart in cluster1 (10.128.1.1). It is also the node where nodetool repair seems to run fine, and it is acting as the seed node of the second cluster.

I am confused about what might be causing this inconsistency in load and replication factor. Has anyone ever seen a different replication factor for the same keyspace on different nodes? Is there a problem in my workflow? Can anyone please suggest the best way to move data from one cluster to another?

Any help will be greatly appreciated.

On Tue, Apr 17, 2018 at 6:52 AM, Faraz Mateen <fmat...@an10.io> wrote:

> Thanks for the response guys.
>
> Let me try setting token ranges manually and move the data again to
> correct nodes. Will update with the outcome soon.
>
>
> On Tue, Apr 17, 2018 at 5:42 AM, kurt greaves <k...@instaclustr.com>
> wrote:
>
>> Sorry for the delay.
>>
>>> Is the problem related to token ranges? How can I find out token range
>>> for each node?
>>> What can I do to further debug and root cause this?
>>
>> Very likely. See below.
>>
>>> My previous cluster has 3 nodes but replication factor is 2. I am not
>>> exactly sure how I would handle the tokens. Can you explain that a bit?
>>
>> The new cluster will have to have the same token ring as the old if you
>> are copying from node to node.
>> Basically you should get the set of tokens
>> for each node (from nodetool ring) and when you spin up your 3 new nodes,
>> set initial_tokens in the yaml to be the comma-separated list of tokens for
>> *exactly one* node from the previous cluster. When restoring the SSTables you
>> need to make sure you take the SSTables from the original node and place them
>> on the new node that has the *same* list of tokens. If you don't do this
>> it won't be a replica for all the data in those SSTables and consequently
>> you'll lose data (or it simply won't be available).
>>
>
>
> --
> Faraz Mateen
>

--
Faraz Mateen
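
For anyone following the thread, here is a minimal sketch of the token-copying step kurt describes above. The node address comes from the nodetool status output earlier in the thread; the exact commands are my assumption of how one might script it on a Linux shell, not part of the documented procedure:

    # On the old cluster, collect the tokens owned by one node as a
    # comma-separated list (the token is the last column of nodetool ring).
    # 10.128.1.1 is one of the old-cluster addresses shown above.
    nodetool ring | awk '$1 == "10.128.1.1" { print $NF }' | paste -sd, -

    # On the new node that received this node's disk/SSTables, stop Cassandra
    # and set the list in cassandra.yaml before starting it again:
    #   num_tokens: 256
    #   initial_token: <comma-separated list printed by the command above>

Note that the cassandra.yaml property is initial_token (singular); with vnodes it accepts a comma-separated list of tokens.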
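
On the replication-factor question, a quick sketch of how one could check whether all nodes actually agree on the schema and on the keyspace's replication settings. The keyspace name and node address below are placeholders:

    # All nodes should report the same schema version; more than one version
    # listed here means the schema has not propagated to every node.
    nodetool describecluster

    # Compare what each node reports for the keyspace definition
    # (my_keyspace and <node_ip> are placeholders):
    cqlsh <node_ip> -e "DESCRIBE KEYSPACE my_keyspace;"

    # If the replication settings need correcting, alter the keyspace once
    # and then repair it:
    cqlsh <node_ip> -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};"
    nodetool repair -full my_keyspace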