Problems with shuffle

Rustam Aliyev Sun, 07 Apr 2013 05:44:07 -0700

Hi,

After upgrading to vnodes I created and enabled the shuffle operation as suggested. After running it for a couple of hours I had to disable it because the nodes were not keeping up with compactions. I repeated this process 3 times (enable/disable).

I have 5 nodes and each of them had ~35GB. After the shuffle operations described above, some nodes are now reaching ~170GB. In the log files I can see the same files transferred 2-4 times to the same host within the same shuffle session. Worst of all, after all of this only 20 vnodes out of 1280 had been transferred. If it continues at the same speed, it will take about a month or two to complete the shuffle.


I have a few questions to better understand shuffle:

1. Does disabling and re-enabling shuffle start the shuffle process from
   scratch, or does it resume from the last point?

2. Will vnode reallocations speed up as the shuffle proceeds, or will the
   rate remain the same?

3. Why do I see multiple transfers of the same file to the same host? e.g.:

   INFO [Streaming to /10.0.1.8:6] 2013-04-07 14:27:10,038
   StreamReplyVerbHandler.java (line 44) Successfully sent
   /u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db
   to /10.0.1.8
   INFO [Streaming to /10.0.1.8:7] 2013-04-07 16:27:07,427
   StreamReplyVerbHandler.java (line 44) Successfully sent
   /u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db
   to /10.0.1.8

4. When I enable/disable shuffle I receive warning messages such as the
   ones below. Do I need to worry about them?

   cassandra-shuffle -h localhost disable
   Failed to enable shuffling on 10.0.1.1!
   Failed to enable shuffling on 10.0.1.3!
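
To quantify the repeated transfers from question 3, something like the following rough sketch could be run over a node's log. It assumes only the "Successfully sent <file> to /<host>" wording shown in the excerpt above; the function names and the sample text are made up for illustration, and other Cassandra versions may word the message differently.

```python
import re
from collections import Counter

# Matches the "Successfully sent <file> to /<host>" part of a
# StreamReplyVerbHandler entry, as in the excerpt in question 3.
# \s+ also matches newlines, so wrapped entries are handled too.
SENT_RE = re.compile(r"Successfully sent\s+(\S+)\s+to\s+/(\S+)")


def count_transfers(log_text):
    """Count how often each (file, host) pair was reported as sent."""
    return Counter(SENT_RE.findall(log_text))


def repeated_transfers(log_text):
    """Return only the (file, host) pairs that were sent more than once."""
    return {pair: n for pair, n in count_transfers(log_text).items() if n > 1}


if __name__ == "__main__":
    # Illustrative sample built from the two entries quoted in question 3.
    sample = (
        "INFO [Streaming to /10.0.1.8:6] 2013-04-07 14:27:10,038 "
        "StreamReplyVerbHandler.java (line 44) Successfully sent "
        "/u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db "
        "to /10.0.1.8\n"
        "INFO [Streaming to /10.0.1.8:7] 2013-04-07 16:27:07,427 "
        "StreamReplyVerbHandler.java (line 44) Successfully sent "
        "/u01/cassandra/data/Keyspace/Metadata/Keyspace-Metadata-ib-111-Data.db "
        "to /10.0.1.8\n"
    )
    for (path, host), n in sorted(repeated_transfers(sample).items()):
        print(f"{n}x {path} -> {host}")
```

This only counts log messages; it does not tell whether the repeated sends covered the same token ranges or different ones within the same SSTable.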

I couldn't find many docs on shuffle; I have only read through the JIRA and the original proposal by Eric.


BR,
Rustam.
