As Rob mentioned, no one I've heard of (myself included) has successfully
used shuffle in the wild.

Shuffle is *supposed* to be a transparent background process, and it is
designed, in theory, to take a long time to run (weeks is the right way to
think about it).
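
If you do decide to wait it out, it helps to actually track progress
rather than guess. Something like this rough sketch works as a progress
check; it assumes the stock cassandra-shuffle utility that ships with 1.2
is on the PATH, and the output parsing is illustrative rather than a
stable interface:

    import subprocess
    import time

    # Poll the shuffle job and report roughly how many pending range
    # relocations remain (assumes Cassandra 1.2's cassandra-shuffle tool).
    while True:
        out = subprocess.run(["cassandra-shuffle", "ls"],
                             capture_output=True, text=True).stdout
        pending = [line for line in out.splitlines() if line.strip()]
        print("%d relocations still pending" % len(pending))
        if not pending:
            break
        time.sleep(3600)  # shuffle runs for weeks; hourly checks are plenty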

Be sure to keep an eye on your drive space if you are going to wait it
out. Unless you have less than half of your disk capacity in use, you are
going to need to run cleanup periodically to avoid running out of space,
because shuffle NEVER removes data; it only makes copies of the data on
the new destination nodes.

I think that is the area where people tend to see the most failures:
because newer versions of Cassandra can survive OK with more than half of
the disk in use, more and more people are running at > 50% disk
utilization.
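
To make that concrete, here is a minimal sketch of the kind of periodic
check I mean; the data path and threshold are assumptions, so adjust them
for your data_file_directories setting:

    import shutil
    import subprocess

    DATA_DIR = "/var/lib/cassandra/data"  # assumed data directory
    THRESHOLD = 0.50  # shuffle copies but never deletes, so watch 50%

    # Warn and reclaim space once the data volume crosses the threshold.
    usage = shutil.disk_usage(DATA_DIR)
    if usage.used / usage.total > THRESHOLD:
        print("data volume over %d%% full, running cleanup" % (THRESHOLD * 100))
        # "nodetool cleanup" drops data the local node no longer owns.
        subprocess.check_call(["nodetool", "cleanup"])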


On Tue, Sep 17, 2013 at 9:41 PM, Paulo Motta <pauloricard...@gmail.com> wrote:

> That is very disappointing to hear. Vnode support is one of the main
> reasons we're upgrading from 1.1.X to 1.2.X.
>
> So you're saying the only feasible way of enabling vnodes on an upgraded
> C* 1.2 cluster is by doing fork writes to a brand new cluster plus a bulk
> load of sstables from the old cluster? Or is it possible to succeed at
> shuffling, even if that means waiting some weeks for the shuffle to
> complete?
>
>
> 2013/9/17 Robert Coli <rc...@eventbrite.com>
>
>> On Tue, Sep 17, 2013 at 4:00 PM, Juan Manuel Formoso
>> <jform...@gmail.com> wrote:
>>
>> > Can anybody suggest any better alternatives than creating a small
>> > application that reads from one cluster and inserts into the new one?
>> >
>> >
>> http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra
>>
>> In theory, if you wanted to do the "copy-the-files" method while
>> enabling vnodes on the target cluster, you could:
>>
>> 1) create the new target cluster with vnodes enabled
>> 2) fork writes so they go to both the source and target clusters
>> 3) copy 100% of the sstables from all source nodes to all target nodes
>> (making sure sstable names do not collide, probably by adding a few
>> hundred/thousand to the generation numbers from the various nodes in a
>> predictable fashion; see the sketch after this list)
>> 4) be certain that you did not accidentally resurrect data from purged
>> source sstables in 3)
>> 5) run a cleanup compaction on all nodes in the target cluster
>> 6) turn off writes to the old source cluster
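>>
>> As a rough illustration of the renaming in 3), here is an untested
>> sketch; it assumes 1.2-era sstable names of the form
>> Keyspace-ColumnFamily-version-generation-Component.db, and the per-node
>> offsets are yours to choose:
>>
>>     import os
>>
>>     # Hypothetical sketch: bump each sstable's generation number by a
>>     # per-source-node offset so files from different nodes cannot collide.
>>     # Assumes 1.2-style names like "Keyspace1-Standard1-ic-5-Data.db".
>>     def rename_sstables(directory, offset):
>>         for name in os.listdir(directory):
>>             parts = name.split("-")
>>             if len(parts) != 5 or not parts[3].isdigit():
>>                 continue  # skip anything not shaped like an sstable file
>>             parts[3] = str(int(parts[3]) + offset)
>>             os.rename(os.path.join(directory, name),
>>                       os.path.join(directory, "-".join(parts)))
>>
>>     # e.g. files copied from source node 1 get offset 100000, node 2
>>     # gets 200000, and so on:
>>     # rename_sstables("/var/lib/cassandra/data/Keyspace1/Standard1", 100000)
>>
>> You would run it once per source node's copied files, each with a
>> distinct offset, before starting the target nodes.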
>>
>> =Rob
>> * notes that this process would make a good blog post.. :D
>>
>
>
>
> --
> Paulo Ricardo
>
> --
> European Master in Distributed Computing
> Royal Institute of Technology - KTH
> Instituto Superior Técnico - IST
> http://paulormg.com
>
