Re: Shifting data to DCOS

Faraz Mateen Sun, 15 Apr 2018 08:24:48 -0700

*UPDATE* - I created the schema for all the tables in one of the keyspaces,
copied the data to the new directories and ran nodetool refresh. However, a
lot of data seems to be missing.
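
With around 500 tables, the copy step can be planned by matching old and new
data directories on table name. A minimal sketch in Python, assuming directory
names follow the `<table>-<32 hex characters>` pattern seen in this thread;
`table_name` and `plan_copies` are hypothetical helper names, not part of any
Cassandra tooling:

```python
import re

# Table data directories look like "<table>-<32 hex chars>", as in the thread.
_DIR_RE = re.compile(r"^(?P<table>.+)-(?P<id>[0-9a-f]{32})$")


def table_name(dirname):
    """Return the table name from a data directory name, or None if it has no ID suffix."""
    match = _DIR_RE.match(dirname)
    return match.group("table") if match else None


def plan_copies(old_dirs, new_dirs):
    """Pair each old directory with the new directory that has the same table name."""
    new_by_table = {table_name(d): d for d in new_dirs if table_name(d)}
    plan = {}
    for old in old_dirs:
        name = table_name(old)
        if name in new_by_table:
            plan[old] = new_by_table[name]
    return plan


# Directory names taken from the example earlier in this thread.
plan = plan_copies(
    ["data_main_bim_dn_10-a73202c02bf311e8b5106b13f463f8b9"],
    ["data_main_bim_dn_10-c146e8d038c611e8b48cb7bc120612c9"],
)
```

Each resulting pair would still need its SSTable files copied across, followed
by nodetool refresh for that keyspace and table.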


I ran nodetool repair on all three nodes, one by one. The first two nodes took
around 20 minutes each to complete. The third node took much longer and had
not completed even after 14 hours, so eventually I had to stop it manually.

*nodetool compactionstats* gives me the "pending tasks by table name"
traceback, which can be viewed here:
https://gist.github.com/farazmateen/10adce4b2477457f0e20fc95176f66a3

*nodetool netstats* shows a lot of dropped gossip messages on all the
nodes. Here is the output from one of the nodes:

Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 1
Mismatch (Background): 2
Pool Name                    Active   Pending      Completed   Dropped
Large messages                  n/a         0             92         1
Small messages                  n/a         0         355491         0
Gossip messages                 n/a         5        3726945    286613
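
To compare the dropped counts across nodes, the pool table can be parsed into
numbers. A minimal sketch in Python, assuming each data row ends with the four
columns Active, Pending, Completed and Dropped; `parse_pool_rows` and
`dropped_per_completed` are hypothetical helpers, and the ratio is only an
illustrative figure, not something nodetool reports. For the gossip row above
it comes to roughly 7.7 dropped per 100 completed.

```python
# Pool table copied from the netstats output in this message.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Dropped
Large messages                  n/a         0             92         1
Small messages                  n/a         0         355491         0
Gossip messages                 n/a         5        3726945    286613
"""


def parse_pool_rows(text):
    """Map pool name -> counters for each data row of the pool table."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows end with Active, Pending, Completed, Dropped; the last
        # three are integers, while the header row ends with the word "Dropped".
        if len(parts) >= 5 and all(p.isdigit() for p in parts[-3:]):
            rows[" ".join(parts[:-4])] = {
                "pending": int(parts[-3]),
                "completed": int(parts[-2]),
                "dropped": int(parts[-1]),
            }
    return rows


def dropped_per_completed(row):
    """Illustrative ratio only; not a metric that nodetool itself reports."""
    return row["dropped"] / row["completed"] if row["completed"] else 0.0


pools = parse_pool_rows(SAMPLE)
gossip_ratio = dropped_per_completed(pools["Gossip messages"])
```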

Is the problem related to token ranges? How can I find out the token range
for each node?
What can I do to debug this further and find the root cause?
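
For context on the token question: nodetool ring lists the tokens each node
owns. The sketch below, in Python, illustrates how replicas follow from
tokens under SimpleStrategy-style placement (the node owning the token range
plus the next nodes clockwise). It assumes one token per node and uses
made-up token values and node names; `replicas_for` is a hypothetical helper.
With 3 nodes and RF=2, each node holds only part of the data, which may
explain rows going missing when SSTables land on a node with different tokens.

```python
from bisect import bisect_left


def replicas_for(token, ring, rf):
    """Return the nodes holding a key's token.

    ring: list of (token, node) pairs, one token per node (an assumption).
    rf:   replication factor, capped at the number of nodes.
    """
    ring = sorted(ring)
    tokens = [t for t, _ in ring]
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical three-node ring; real tokens come from the cluster itself.
ring = [(-3000, "node1"), (0, "node2"), (3000, "node3")]
owners = replicas_for(100, ring, rf=2)
```

Here a key with token 100 is placed on node3 and node1 only, so node2 would
not be expected to serve it.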

On Tue, Apr 10, 2018 at 4:28 PM, Faraz Mateen <fmat...@an10.io> wrote:

> Sorry for the late reply. I was trying to figure out some other approach
> to it.
>
> @Kurt - My previous cluster has 3 nodes but replication factor is 2. I am
> not exactly sure how I would handle the tokens. Can you explain that a bit?
>
> @Michael - Actually, my DC/OS cluster has an older version than my
> previous cluster. However both of them have hash with their data
> directories. Previous cluster is on version 3.9 while new DC/OS cluster is
> on 3.0.16.
>
>
> On Fri, Apr 6, 2018 at 2:35 PM, kurt greaves <k...@instaclustr.com> wrote:
>
>> Without looking at the code I'd say maybe the keyspaces are displayed
>> purely because the directories exist (but it seems unlikely). The process
>> you should follow instead is to exclude the system keyspaces for each node
>> and manually apply your schema, then upload your CFs into the correct
>> directory. Note this only works when RF=#nodes, if you have more nodes you
>> need to take tokens into account when restoring.
>>
>>
>> On Fri., 6 Apr. 2018, 17:16 Affan Syed, <as...@an10.io> wrote:
>>
>>> Michael,
>>>
>>> both of the folders have the hash, so I don't think that would be an
>>> issue.
>>>
>>> What is strange is why the tables don't show up if the keyspaces are
>>> visible. Shouldn't that be metadata that can be edited once and then be
>>> visible?
>>>
>>> Affan
>>>
>>> - Affan
>>>
>>> On Thu, Apr 5, 2018 at 7:55 PM, Michael Shuler <mich...@pbandjelly.org>
>>> wrote:
>>>
>>>> On 04/05/2018 09:04 AM, Faraz Mateen wrote:
>>>> >
>>>> > For example, if the table is *data_main_bim_dn_10*, its data directory
>>>> > is named data_main_bim_dn_10-a73202c02bf311e8b5106b13f463f8b9. I created
>>>> > a new table with the same name through cqlsh. This resulted in creation
>>>> > of another directory with a different hash i.e.
>>>> > data_main_bim_dn_10-c146e8d038c611e8b48cb7bc120612c9. I copied all data
>>>> > from the former to the latter.
>>>> >
>>>> > Then I ran *"nodetool refresh ks1  data_main_bim_dn_10"*. After that I
>>>> > was able to access all data contents through cqlsh.
>>>> >
>>>> > Now, the problem is, I have around 500 tables and the method I mentioned
>>>> > above is quite cumbersome. Bulkloading through sstableloader or remote
>>>> > seeding are also a couple of options but they will take a lot of time.
>>>> > Does anyone know an easier way to shift all my data to new setup on DC/OS?
>>>>
>>>> For upgrade support from older versions of C* that did not have the hash
>>>> on the data directory, the table data dir can be just
>>>> `data_main_bim_dn_10` without the appended hash, as in your example.
>>>>
>>>> Give that a quick test to see if that simplifies things for you.
>>>>
>>>> --
>>>> Kind regards,
>>>> Michael
>>>>
>>>
>
>
> --
> Faraz Mateen
>



-- 
Faraz Mateen
