> These are taken just before starting shuffle (ran repair/cleanup the day 
> before).
> During the shuffle all reads/writes to the cluster were disabled.
> 
> nodetool status keyspace:
> 
> Load       Tokens  Owns (effective)  Host ID
> 80.95 GB   256     16.7%             754f9f4c-4ba7-4495-97e7-1f5b6755cb27

I'm a little confused as to why nodetool status was showing 256 tokens before 
the shuffle was run. 

Did you set num_tokens during the upgrade process? Though I doubt that would 
change anything, as the initial_token is set in the system tables during 
bootstrap. 
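
If you want to double-check, on 1.2 the tokens a node saved at bootstrap are 
visible in the system keyspace, and the configured values are in cassandra.yaml 
(the path below assumes a tarball install, adjust it for your layout):

    # in cqlsh (run bin/cqlsh, then):
    #   SELECT tokens FROM system.local;
    # and from the shell, what the node was configured with:
    grep -E 'num_tokens|initial_token' conf/cassandra.yaml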

`bin/cassandra-shuffle ls` will show the list of moves the shuffle process 
was/is going to run. What does that say?
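
For reference, the subcommands (from memory, so check the tool's own usage 
output) are roughly:

    bin/cassandra-shuffle create    # schedule the token relocations
    bin/cassandra-shuffle ls        # list the pending relocations
    bin/cassandra-shuffle enable    # start streaming the relocations
    bin/cassandra-shuffle disable   # pause it again
    bin/cassandra-shuffle clear     # drop any pending relocations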

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/04/2013, at 5:08 AM, John Watson <j...@disqus.com> wrote:

> That's what we tried first before the shuffle. And ran into the space issue.
> 
> That's detailed in another thread titled: "Adding nodes in 1.2 with vnodes 
> requires huge disks"
> 
> 
> On Mon, Apr 29, 2013 at 4:08 AM, Sam Overton <s...@acunu.com> wrote:
> An alternative to running shuffle is to do a rolling bootstrap/decommission. 
> You would set num_tokens on the existing hosts (and restart them) so that 
> they split their ranges, then bootstrap in N new hosts, then decommission the 
> old ones.
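> 
> Roughly (a sketch; adjust paths and restart mechanics for your install):
> 
>     # 1. on each existing host, edit cassandra.yaml and restart it:
>     #        num_tokens: 256    (leave initial_token as-is; the node will
>     #        split its existing range into that many contiguous pieces)
>     # 2. on each new host, also set num_tokens: 256 and start it so it
>     #    bootstraps as normal
>     # 3. once the new hosts have joined, retire the old ones one at a time:
>     bin/nodetool decommission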
> 
> 
> 
> On 28 April 2013 22:21, John Watson <j...@disqus.com> wrote:
> The amount of time/space cassandra-shuffle requires when upgrading to vnodes 
> should really be made apparent in the documentation (when some is written).
> 
> The only mention of the exorbitant amount of time involved is an easily 
> missed bullet point in: http://wiki.apache.org/cassandra/VirtualNodes/Balance
> 
> "Shuffling will entail moving a lot of data around the cluster and so has the 
> potential to consume a lot of disk and network I/O, and to take a 
> considerable amount of time. For this to be an online operation, the shuffle 
> will need to operate on a lower priority basis to other streaming operations, 
> and should be expected to take days or weeks to complete."
> 
> We tried running shuffle on a QA version of our cluster, and two things came 
> to light:
>  - Even with no reads/writes it was going to take 20 days
>  - Each machine needed enough free disk space to potentially hold the entire 
> cluster's sstables on disk
> 
> Regards,
> 
> John
> 
> 
> 
> -- 
> Sam Overton
> Acunu | http://www.acunu.com | @acunu
> 
