Got it. Thanks for the responses and the patience.
Tim

-----Original Message-----
From: "Joseph Blomstedt" <j...@basho.com>
Sent: Thursday, January 5, 2012 2:14pm
To: "Tim Robinson" <t...@blackstag.com>
Cc: "Aphyr" <ap...@aphyr.com>, riak-users@lists.basho.com
Subject: Re: Absolute consistency

Internet went down as I was writing an email. Looks like everyone
already did a great job answering the availability issues, but I
might as well chime in as a Basho engineer.

On a side note, it looks like we've completely hijacked the
"Absolute consistency" question initially proposed. I'll email out
some thoughts on that later.

> Why would anyone ever want 2 copies on one physical PC?

You wouldn't want 2 copies on one machine. But, if a user requests 3
replicas, and there are only 2 machines, then that third replica has
to go somewhere.

This is a common scenario during initial data loading, development,
and testing/evaluation of Riak. For example, people often start up a
single-node Riak cluster, load up some data, and then add 2 or more
nodes to turn it into a properly sized cluster. Before the new nodes
are added, all the replicas live on the single node. As nodes are
added, replicas will move to the additional nodes to ensure
availability.
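To make that concrete, here's a toy sketch (plain round-robin partition ownership over a 64-partition ring, not Basho's actual claim algorithm) showing why a 2-node cluster with N=3 must double up a replica somewhere, while 4 nodes need not:

```python
# Toy model: partitions are claimed round-robin by the cluster's nodes.
# A key's preference list is the N consecutive partitions starting at
# its home partition; with fewer nodes than N, some node must own two
# of those partitions and thus hold two replicas.

NUM_PARTITIONS = 64
N_VAL = 3  # requested number of replicas

def min_distinct_owners(num_nodes):
    """Smallest number of distinct nodes across all preference lists."""
    owners = [p % num_nodes for p in range(NUM_PARTITIONS)]
    return min(
        len({owners[(p + i) % NUM_PARTITIONS] for i in range(N_VAL)})
        for p in range(NUM_PARTITIONS)
    )

print(min_distinct_owners(2))  # 2 -- the third replica doubles up on one machine
print(min_distinct_owners(4))  # 3 -- enough nodes: replicas on distinct machines
```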

Simply put, unless you have enough machines to hold all your requested
replicas, there isn't much Riak can do for you. You could certainly
argue that Riak could merge replicas and write only 2 in the
reduced-node case, and then copy a replica out as enough machines are
added to match/exceed the replica count. But that's additional
complexity for very minor gain. I would rather Riak have a single,
easily understood operating mode that you can rely on to understand
your availability guarantees than have alternative operating modes
depending on cluster size.

If you want to run a Riak cluster with N=3, you should have 3+ nodes.

With that said, Kyle correctly mentioned edge cases where having more
nodes than N could still lead to reduced availability. Specifically,
if the number of nodes does not cleanly divide the ring size, then
there may be preference lists with reduced availability at the
wrap-around point of the Riak ring. For example, a 64-partition ring
with 4 nodes won't have this problem, but a 64-partition ring with 3
nodes may. This is considered a bug, and is documented at:
This is considered a bug, and is documented at:
https://issues.basho.com/show_bug.cgi?id=228
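The wrap-around effect can be sketched with the same kind of toy model (naive round-robin claim, not the real claim code): when the node count doesn't divide the ring size, the preference lists that wrap past the last partition can land on fewer than N distinct nodes.

```python
# Toy round-robin claim over a 64-partition ring. Count preference
# lists (N consecutive partitions) owned by fewer than N distinct
# nodes; only preflists crossing the wrap-around point are affected
# once there are more than N nodes.

NUM_PARTITIONS = 64
N_VAL = 3

def degraded_preflists(num_nodes):
    """Number of preference lists with fewer than N distinct owners."""
    owners = [p % num_nodes for p in range(NUM_PARTITIONS)]
    return sum(
        1
        for p in range(NUM_PARTITIONS)
        if len({owners[(p + i) % NUM_PARTITIONS] for i in range(N_VAL)}) < N_VAL
    )

print(degraded_preflists(4))  # 0 -- 4 divides 64 cleanly: no overlap anywhere
print(degraded_preflists(3))  # 2 -- the two preflists crossing the wrap-around
```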

As listed on the issue, there are operational workarounds that ensure
this doesn't occur, such as going with 64/4 rather than 64/3. Fixing
the issue entirely is something we at Basho are working towards. The
new ring-claim algorithm in the upcoming release of Riak makes the
wrap-around issue much less likely whenever you have more than N
nodes. A future release will address the issue more directly.

-Joe

On Thu, Jan 5, 2012 at 1:12 PM, Tim Robinson <t...@blackstag.com> wrote:
> Thank you for this info. I'm still somewhat confused.
>
> Why would anyone ever want 2 copies on one physical PC? Correct me if I am
> wrong, but part of the sales pitch for Riak is that the cost of hardware is
> lessened by distributing your data across a cluster of less expensive
> machines as opposed to having it all reside on one enormous server with
> very little redundancy.
>
> The 2 copies of data on one physical PC provide no redundancy, but increase
> hardware costs quite a bit.
>
> Right?
>
> Thanks,
> Tim
>
> -----Original Message-----
> From: "Aphyr" <ap...@aphyr.com>
> Sent: Thursday, January 5, 2012 1:01pm
> To: "Tim Robinson" <t...@blackstag.com>
> Cc: "Runar Jordahl" <runar.jord...@gmail.com>, riak-users@lists.basho.com
> Subject: Re: Absolute consistency
>
> On 01/05/2012 11:44 AM, Tim Robinson wrote:
>> Ouch.
>>
>> I'm shocked that is not considered a major bug. At minimum that kind of 
>> stuff should be front and center in their wiki/docs. Here I am thinking n=2
>> on a 3-node cluster means I'm covered when in fact I am not. It's the whole
>> reason I gave Riak consideration.
>>
>> Tim
>
> I think you may have this backwards. N=3 and 2 nodes would mean one node
> has 1 copy, and 1 node has 2 copies, of any given piece. For n=2 and 3
> nodes, there should be no overlap.
>
> The other thing to consider is that for certain combinations of
> partition number P and node number N, distributing partitions mod N can
> result in overlaps at the edge of the ring. This means zero to n
> preflists can overlap on some nodes. That means n=3 can, *with the wrong
> choice of N and P*, result in minimum 2 machines having copies of any
> given key, assuming P > N.
>
> There are also failure modes to consider. I haven't read the new key
> balancing algo, so my explanation may be out of date.
>
> --Kyle
>
>
> Tim Robinson
>
>
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



-- 
Joseph Blomstedt <j...@basho.com>
Software Engineer
Basho Technologies, Inc.
http://www.basho.com/

