One thing that I *think* I've figured out is that the answer to "how many
replicas can you lose and stay up" is actually n-w for writes, and n-r for
reads.

So with n=3 and r=2 and w=2, I can only lose one replica (n-r = n-w = 1)
and stay fully up. The loss of two replicas due to AZ failure means that I
still *have* my data ("durability") but I might lose _access_ to it
("availability") for a little bit. And with that weird feature that Riak
has (the feature's name escapes me for now?) I might even be able to write
new data if my cluster figures out that the downed nodes are actually down;
I think it just stores the writes on the remaining boxen, and eventually
they get distributed back once the nodes come back. Neat stuff.
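
To make that arithmetic concrete, here's a minimal Python sketch of the
n-w / n-r rule. It assumes strict quorums (a read needs r of the n replicas
to answer, a write needs w of them), and it deliberately ignores the
store-it-on-the-remaining-boxes behaviour I mentioned above. The function
name is made up for illustration; it isn't anything from Riak.

```python
def replicas_you_can_lose(n, r, w):
    """How many of the n replicas can be down while strict-quorum
    reads (need r replies) and writes (need w replies) still succeed."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return {"reads": n - r, "writes": n - w}


if __name__ == "__main__":
    # The two configurations discussed in this thread.
    for n, r, w in [(3, 2, 2), (5, 3, 3)]:
        print(f"n={n} r={r} w={w} ->", replicas_you_can_lose(n, r, w))
```

So n=3, r=2, w=2 tolerates one lost replica for both reads and writes, and
n=5, r=3, w=3 tolerates two.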

So after working through all of that, I *think* I actually have an argument
I can make for 4 nodes as being somewhat superior to 5. Since I'm on AWS, I
can scale by "embiggening" my nodes for a while, until I get up to around
the 128GB RAM boxes; then I can start to double-up on AZs (to keep things
simple, I'd probably go from 4 straight to 8). I would probably - at that
point - start to have to do some math to figure out what new 'n' might make
sense. Maybe n: 5, r: 3, w: 3? I'll cross that bridge when I come to it. (I
know there's all kinds of awful misery with changing 'n' values in a
bucket: forcing read-repairs and all kinds of stuff so that your reads and
writes don't start failing. But again, by then I might have dedicated
minions I could make figure that stuff out.) Or maybe there's an inherent
advantage to going straight to 8 instead of just 'embiggening'. Again, I'll
cross that bridge (probably by talking to you all!) when I come to it.

I think the Rack Awareness sounds like a *great* feature - but I'd also
love something that's a little more...strict about making sure that my
replicas never live on the same node. (Current advice is that you should
have four boxes for an 'n' of 3 to ensure one box doesn't have two copies
of data; I'd love it if at some point they could make that guarantee with
number of boxes=n. I understand it's being worked on.) Once rack-awareness
comes in - or the n=number of boxes fix comes in - I'll probably have to
re-ponder my math. That'll be a good problem for me to have, though :)
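
Here's a toy Python model of why number of boxes=n can put two copies on
one box. Loud assumptions: a ring of 64 partitions, handed out to nodes
strictly round-robin, with each key's replicas living on n consecutive
partitions. That is a simplification for illustration, not Riak's actual
claim algorithm, and every name in it is hypothetical.

```python
RING_SIZE = 64  # assumed number of partitions in the ring


def preference_lists(num_nodes, n_val=3, ring_size=RING_SIZE):
    """Toy model: partition p is owned by node p % num_nodes, and a key's
    replicas go to n_val consecutive partitions, wrapping around the ring."""
    owners = [p % num_nodes for p in range(ring_size)]
    return [
        [owners[(start + k) % ring_size] for k in range(n_val)]
        for start in range(ring_size)
    ]


def doubled_up(num_nodes, n_val=3):
    """Count preference lists where some node holds more than one replica."""
    return sum(
        1
        for nodes in preference_lists(num_nodes, n_val)
        if len(set(nodes)) < n_val
    )


def worst_replica_loss(num_nodes, down_nodes, n_val=3):
    """Largest number of replicas any single preference list loses
    when every node in down_nodes is unavailable."""
    return max(
        sum(1 for node in nodes if node in down_nodes)
        for nodes in preference_lists(num_nodes, n_val)
    )


if __name__ == "__main__":
    for num_nodes in (3, 4, 5):
        print(num_nodes, "nodes:", doubled_up(num_nodes), "doubled-up lists")
    print("5 nodes, nodes 3 and 4 down: worst case loses",
          worst_replica_loss(5, {3, 4}), "replicas")
```

In this toy model, 3 nodes give two doubled-up preference lists (the
wraparound at the end of the ring, because 64 isn't divisible by 3), while
4 and 5 nodes give none. And with 5 nodes, losing nodes 3 and 4 together
costs any one chunk of data at most 2 of its 3 replicas, which is the
"still have my data, might lose access" case from above.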

-B.


On Tue, Aug 13, 2013 at 8:21 PM, John Eikenberry <j...@zhar.net> wrote:

> Brady Wetherington wrote:
>
> > First off - I know 5 instances is the "magic number" of instances to have.
> > If I understand the thinking here, it's that at the default redundancy
> > level ('n'?) of 3, it is most likely to start getting me some scaling
> > (e.g., performance > just that of a single node), and yet also have
> > redundancy; whereby I can lose one box and not start to take a performance
> > hit.
>
> With n=3 wouldn't you just need to avoid having more than 2 (of 5) nodes in the
> same zone? With 5 nodes you shouldn't have to worry about replicas being on the
> same node, so if you only have 2 nodes in 1 zone you wouldn't lose data if you
> lost a zone.
>
> The only place I see there being a problem is in regions with only 2 zones or
> when you need to expand beyond the 2/zone number. Then you just have to do
> backups and accept that you will suffer an outage if you lose a zone.
>
> The cure for all this is having riak get so called "rack awareness" so you can
> configure it to make sure that data is replicated across multiple zones. This
> is supposed to be coming at some point [1].
>
> [1] https://github.com/basho/riak/issues/308
>
> > My question is - I think I can only do 4 in a way that makes sense. I only
> > have 4 AZ's that I can use right now; AWS won't let me boot instances in
> > 1a. My concern is if I try to do 5, I will be "doubling up" in one AZ - and
> > in AWS you're almost as likely to lose an entire AZ as you are a single
> > instance. And so, if I have instances doubled-up in one AZ (let's say
> > us-east-1e), and then I lose 1e, I've now lost two instances. What are the
> > chances that all three of my replicas of some chunk of my data are on those
> > two instances? I know that it's not guaranteed that all replicas are on
> > separate nodes.
> >
> > So is it better for me to ignore the recommendation of 5 nodes, and just do
> > 4? Or to ignore the fact that I might be doubling-up in one AZ? Also,
> > another note. These are designed to be 'durable' nodes, so if one should go
> > down I would expect to bring it back up *with* its data - or, if I
> > couldn't, I would do a force-replace or replace and rebuild it from the
> > other replicas. I'm definitely not doing instance-store. So I don't know if
> > that mitigates my need for a full 5 nodes. I would also consider losing one
> > node to be "degraded" and would probably seek to fix that problem as soon
> > as possible, so I wouldn't expect to be in that situation for long. I would
> > probably tolerate a drop in performance during that time, too. (Not a
> > super-severe one, but 20-30 percent? Sure.)
> >
> > What do you folks think?
> >
> > -B.
>
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
> --
>
> John Eikenberry
> [ j...@zhar.net - http://zhar.net ]
> [ PGP public key @ http://zhar.net/jae_at_zhar_net.gpg ]
> ________________________________________________________________________
> Sic gorgiamus allos subjectatos nunc
