riak 2 - how much space is needed for online resizing?

2014-09-02 Thread Max Vernimmen
Hi,

We have 5 riak nodes running riak-2.0.0pre20-1.el6.x86_64 with a ring size of 
64. We would like to do a ring resize because the distribution of content is 
very uneven (64/5 leaves a remainder of 4 partitions that all end up on the same node). 
The documentation says riak-2 can do this online 
(http://docs.basho.com/riak/2.0.0/ops/advanced/ring-resizing/) and warns 'Make 
sure that you have sufficient storage to complete the resize operation'.

Could anyone tell me how much is 'sufficient'?
And in addition, some of the nodes in the cluster have more free space 
available than other nodes (some are at 40% used disk, others at 60%). Is the 
location of the space important?

Thank you,


Max
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


RE: riak 2 - how much space is needed for online resizing?

2014-09-02 Thread Max Vernimmen
Hi Jordan,

Thank you for your response.
With a ring size of 64 on 5 nodes we have a remainder of 4 partitions on one node. 
Each partition is 1/64th of the ring, so about 6.25% of the total size ends up on 1 
node in addition to the equally spread out content.
If we moved to 128 partitions the remainder would become 3, but since 
the ring is 2x as large, each partition is only 1/128 in size. So about 2.3% of the 
total size would end up on 1 node in addition to the equally spread out content. 
For 256 it becomes 0.4%.
To me that looks like a very big improvement. You said it wouldn’t make much 
difference; am I making a mistake in my reasoning?
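
The arithmetic above can be checked with a short sketch. It assumes, as the message does, that every leftover partition lands on a single node; that is the message's premise, not a statement about how Riak's claim algorithm actually assigns partitions. The function name `extra_share` is hypothetical and not part of Riak.

```python
def extra_share(ring_size, nodes):
    """Fraction of the ring held by the leftover partitions.

    Assumption (taken from the message, not from Riak's claim algorithm):
    all ring_size % nodes leftover partitions end up on one node.
    """
    leftover = ring_size % nodes
    return leftover / ring_size


for ring_size in (64, 128, 256):
    share = extra_share(ring_size, 5)
    print(f"ring {ring_size}: {ring_size % 5} leftover, {share:.2%} extra on one node")
```

For 5 nodes this gives 6.25%, 2.34% and 0.39%, matching the figures quoted above.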

I searched for claim_v3 and it looks like it could help, so I’m going to try 
it. Rebuilding the cluster is not really an option at this time. Would changing 
the claim method also improve the situation for an existing cluster? I would 
think that because of the automatic rebalancing, the vnodes would move due to 
the different claiming mechanism. Am I right?

Yes, now that 2.0.0 has been shipped we are looking to upgrade before making 
any changes.
Thanks,

Max

From: Jordan West [mailto:jw...@basho.com]
Sent: Tuesday, 2 September 2014 23:17
To: Max Vernimmen
Cc: riak-users@lists.basho.com
Subject: Re: riak 2 - how much space is needed for online resizing?

Hi Max,

A ring resize won't make things much better. It is intended to change the 
number of partitions from 64, in your case, to 32 or 256, for example. While 
these ring sizes may have better distributions with 5 nodes, they will not be 
perfect. The quickest solution using the existing cluster and settings would be 
to add 3 nodes (for a total of 8) or remove one (for a total of 4) -- we don't 
suggest the latter, you can read more about why in [1], but decide based on 
your application's needs. There are a few other options but they are more 
complicated. Somewhat related, since you are using a pre-build, is this 
development/test data? Do you have the option of re-building the cluster? If 
you would like to stick with 5 nodes and can re-build the cluster from scratch, 
another alternative is to try "claim_v3" (the default is v2). See 
wants_claim_fun and choose_claim_fun in [2]. You'll want to set these to 
wants_claim_v3 and choose_claim_v3, respectively, in the riak_core section of 
your advanced.config. It may result in a better, albeit not perfect, balance.

To answer your original question about capacity, a conservative rule is to stay 
below 50% disk capacity on every node.
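
A minimal sketch of how that rule could be checked, assuming the per-node disk usage has already been collected by some other means. The node names, the split of nodes between the two usage levels, and the `nodes_over_threshold` helper are all hypothetical; only the 40% and 60% figures come from the original question.

```python
def nodes_over_threshold(disk_used, threshold=0.50):
    """Return the nodes whose used-disk fraction is at or above the threshold."""
    return sorted(node for node, used in disk_used.items() if used >= threshold)


# Hypothetical node names and assignment; the question only says that
# some nodes are at 40% used disk and others at 60%.
disk_used = {"riak1": 0.40, "riak2": 0.40, "riak3": 0.60, "riak4": 0.60, "riak5": 0.40}
print(nodes_over_threshold(disk_used))
```

Under these assumed numbers, the nodes at 60% would not meet the conservative rule, which suggests that where the free space sits does matter.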

I would also suggest upgrading to a more recent build.

Jordan


[1] http://basho.com/why-your-riak-cluster-should-have-at-least-five-nodes/
[2] http://docs.basho.com/riak/1.4.10/ops/advanced/configs/configuration-files/



replacing node results in error with diag

2014-09-30 Thread Max Vernimmen
Hi,

Today I finished upgrading 2.0.0-pre20 to 2.0.0-1. Once that was done I did a 
node replace according to the instructions at 
http://docs.basho.com/riak/latest/ops/running/nodes/replacing/
Once the replacing was done, our monitoring notified us about a problem with 
the cluster. Our monitoring does a 'riak-admin diag' and each of the nodes is 
now giving the output I've posted here: 
https://gist.github.com/anonymous/a313a07b0cd1da1c
The diag output references a node, which is the replaced node. It is 
no longer in the cluster. I confirmed the ring was settled; the replaced node is 
no longer listed in the web interface of the cluster, nor is it in 
the `riak-admin status` output. Only a restart of the riak service on each of 
the nodes resolves the problem. Restarting only one node fixes the diag 
status only for that node.

To me it seems like there is some state left in the cluster nodes after a node 
is replaced, causing the `riak-admin diag` command to fail. Has anyone else 
seen this? Would this classify as a bug or did I simply do something wrong? :)

Best regards,


Max Vernimmen



RE: replacing node results in error with diag

2014-09-30 Thread Max Vernimmen
Hi Sargun,

The debug output can be found here: 
https://gist.github.com/anonymous/7e82fa3a62595fbd2cc7
Your suggested command does indeed resolve the problem nicely, which saves me a lot 
of restarting. Thanks for your help!

Best regards,


Max Vernimmen

> -Original Message-
> From: Sargun Dhillon [mailto:sar...@sargun.me]
> Sent: Tuesday, 30 September 2014 21:57
> To: Max Vernimmen
> Cc: riak-users@lists.basho.com
> Subject: Re: replacing node results in error with diag
> 
> So, I don't have a ton of experience with Riaknostic, but taking a
> casual glance at the source code, it appears that Riaknostic caches
> some node-local data about the ring (see:
> https://github.com/basho/riaknostic/blob/2.0.0/src/riaknostic_node.erl#L192
> -L208).
> You should be able to unset this by attaching to a node "riak attach"
> and running application:unset_env(riaknostic, local_stats). --
> although, it'd be nice to get a dump of your local env first for
> debugging purposes, you can get that via io:format("Local env: ~p~n",
> [application:get_all_env(riaknostic)]). (including the period).
> 
> If that clears one node, you can do it on all of your nodes by issuing
> rpc:multicall(application, unset_env, [riaknostic, local_stats]). on
> one node.
> 


increasing N value

2014-11-03 Thread Max Vernimmen
Hi,

I’d like to increase our cluster’s n value from 2 to n=3. The documentation 
says this is ‘not recommended’ but it doesn’t say why and the functionality is 
there so… ☺
Once I change the setting for a specific bucket, there seem to be 2 ways of 
making sure existing objects in the bucket get a 3rd copy:

-  Force a read repair

-  Wait for the Active Anti-Entropy to resolve all missing replicas

Some questions about this:

-  Is there a way to know that AAE is done with a bucket and all 
content is now stored with n=3?

-  If I do a read with r=1 (default?), is there a chance that a node 
will respond with ‘content not found’ and will it be left at that, or will riak 
continue searching for the object on a different node?

-  Will it automatically do a repair when a ‘not found’ is triggered?

I guess what I’m trying to find out is: what ways are there to make sure all 
content has achieved 3 replicas after changing to n=3?
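
One way to picture the "force a read repair" option is a loop that reads every key once, since read repair is triggered by reads. The sketch below is deliberately client-agnostic: `fetch` is a hypothetical callable standing in for whatever client call reads a key from the bucket, and the list of keys is assumed to be known already. It only reports which reads came back empty; it does not confirm that a third replica has been written.

```python
def read_every_key(keys, fetch):
    """Read each key once and return the keys whose read returned nothing.

    fetch is a hypothetical stand-in for a client read; it should return
    None when the object is reported as not found.
    """
    not_found = []
    for key in keys:
        if fetch(key) is None:
            not_found.append(key)
    return not_found


# Usage with an in-memory dict in place of a real cluster.
store = {"a": b"1", "b": b"2"}
print(read_every_key(["a", "b", "c"], store.get))
```

Keys returned by the function are candidates for a second read or for manual inspection.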

Best regards,


Max