Re: Clarifying "Read-before-Write"

Russell Brown Sat, 26 Nov 2011 00:40:08 -0800

On 26 Nov 2011, at 01:14, Andres Jaan Tack wrote:

> So I was just reading and thinking about this, and I don't understand the 
> advice offered under "Read-before-Write" at 
> http://wiki.basho.com/Client-Implementation-Guide.html.
> 
> "Riak will return an encoded vector clock with every "fetch" or "read" 
> request that does not result in a "not found" response. In addition to the 
> Client ID, this vector clock tells Riak how to resolve concurrent writes, 
> essentially representing the "last seen" version of the object to which the 
> client made modifications. In order to prevent sibling explosion, clients 
> should always have a vector clock before sending a write, and send the vector 
> clock as part of the write request. Therefore, it is essential that keys are 
> fetched before being written (except in the case where Riak selects the key 
> or there is a priori knowledge that the key is new). Client libraries that 
> make this automatic will reduce operational issues by limiting sibling 
> explosion. Clients may also choose to perform automatic Sibling Resolution on 
> read."
>  
> I'm having trouble understanding the advice. I get that if I'm aware of all 
> the siblings, I can resolve them (optionally) with that vector clock. What I 
> don't understand here: If an application PUTs to an object out of the blue, 
> not having read it first, should the client library read-before-write?


Yes it should.

> This seems like a great way to blow away siblings by accident. 

But it should never do that. If siblings are encountered, it should *do* 
something about them.

> Or is the point rather to avoid sibling explosion for applications that don't 
> care about losing information?

A well-behaved client library will not blindly PUT a value "over the top" of 
siblings, but will push the problem to the library user (hopefully in some 
helpful way, such as automatically applying some domain-specific resolution 
logic).

So, in the case of the Java client, when you store (or fetch, for that matter) 
you should provide an implementation of the ConflictResolver<T> interface to 
the client. It is then executed to resolve any siblings found on the pre-store 
fetch. If you don't provide a conflict resolver, the Java client uses one that 
throws a runtime exception when it encounters siblings on fetch, precisely so 
that you don't do as you describe and blow away potentially meaningful sibling 
values.
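
To make the resolver idea concrete, here is a minimal, self-contained sketch. 
It does not use the real client classes: the ConflictResolver<T> interface 
below is a hypothetical stand-in declared locally, and ThrowingResolver and 
UnionResolver are names invented for illustration. It only shows the two 
behaviours described above: a default that refuses to pick a winner, and a 
domain-specific merge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ResolverSketch {

    /** Hypothetical stand-in for the client's resolver interface; the real signature may differ. */
    interface ConflictResolver<T> {
        T resolve(Collection<T> siblings);
    }

    /** Default-style behaviour: refuse to choose when more than one sibling exists. */
    static class ThrowingResolver<T> implements ConflictResolver<T> {
        @Override
        public T resolve(Collection<T> siblings) {
            if (siblings.size() > 1) {
                throw new IllegalStateException("Siblings found: " + siblings.size());
            }
            // Zero or one value: nothing to resolve.
            return siblings.isEmpty() ? null : siblings.iterator().next();
        }
    }

    /** Example domain-specific logic: merge set-valued siblings by taking their union. */
    static class UnionResolver implements ConflictResolver<Set<String>> {
        @Override
        public Set<String> resolve(Collection<Set<String>> siblings) {
            Set<String> merged = new TreeSet<String>();
            for (Set<String> sibling : siblings) {
                merged.addAll(sibling);
            }
            return merged;
        }
    }

    public static void main(String[] args) {
        List<Set<String>> siblings = new ArrayList<Set<String>>();
        siblings.add(new TreeSet<String>(Arrays.asList("apple", "pear")));
        siblings.add(new TreeSet<String>(Arrays.asList("pear", "plum")));

        // Domain-specific resolution keeps everything from both siblings.
        System.out.println(new UnionResolver().resolve(siblings));

        // The default-style resolver aborts instead of silently dropping data.
        try {
            new ThrowingResolver<Set<String>>().resolve(siblings);
        } catch (IllegalStateException e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

A union merge only suits data where keeping everything is safe; other domains 
will need different logic, which is why the choice is left to the library user.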

Maybe the wording on the wiki should make this clearer. Maybe it should read:

"Clients [that automatically fetch before store] _must_ choose to either 
perform automatic Sibling Resolution *or* abort the write and notify the 
caller of the presence of siblings."
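
As a rough illustration of that rule, the sketch below runs a fetch-before-store 
against a toy in-memory store. Everything in it is an assumption made for the 
example: ToyStore, Fetched, Resolver and store() are invented names, the 
"vector clock" is just a counter, and the toy rule "a write carrying the 
current clock replaces all siblings, any other write adds one" is a 
simplification, not Riak's actual behaviour. Passing the new value to the 
resolver along with the existing siblings is also a simplification.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class ReadBeforeWriteSketch {

    /** Hypothetical resolver callback; not the real client interface. */
    interface Resolver {
        String resolve(Collection<String> candidates);
    }

    /** Result of a fetch in this toy model: sibling values plus an opaque clock. */
    static class Fetched {
        final List<String> siblings;
        final long vclock;

        Fetched(List<String> siblings, long vclock) {
            this.siblings = siblings;
            this.vclock = vclock;
        }
    }

    /** Toy store: a write with the current clock replaces siblings, any other write adds one. */
    static class ToyStore {
        private final Map<String, List<String>> values = new HashMap<String, List<String>>();
        private final Map<String, Long> clocks = new HashMap<String, Long>();

        Fetched fetch(String key) {
            List<String> current = values.containsKey(key) ? values.get(key) : new ArrayList<String>();
            long clock = clocks.containsKey(key) ? clocks.get(key) : 0L;
            return new Fetched(new ArrayList<String>(current), clock);
        }

        void put(String key, String value, Long vclock) {
            if (!values.containsKey(key)) {
                values.put(key, new ArrayList<String>());
            }
            long current = clocks.containsKey(key) ? clocks.get(key) : 0L;
            if (vclock != null && vclock.longValue() == current) {
                values.get(key).clear(); // the writer has seen everything: replace
            }
            values.get(key).add(value);  // otherwise the value lands as a sibling
            clocks.put(key, current + 1);
        }
    }

    /** Fetch first; on siblings either resolve or abort; then write with the fetched clock. */
    static void store(ToyStore store, String key, String newValue, Resolver resolver) {
        Fetched fetched = store.fetch(key);
        String toWrite = newValue;
        if (fetched.siblings.size() > 1) {
            if (resolver == null) {
                throw new IllegalStateException("siblings present on '" + key + "'; write aborted");
            }
            List<String> candidates = new ArrayList<String>(fetched.siblings);
            candidates.add(newValue);
            toWrite = resolver.resolve(candidates);
        }
        store.put(key, toWrite, fetched.vclock);
    }

    public static void main(String[] args) {
        ToyStore riak = new ToyStore();
        riak.put("cart", "apple", null); // two blind writes without a clock
        riak.put("cart", "pear", null);  // leave two siblings behind

        try {
            store(riak, "cart", "plum", null); // no resolver: must abort
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // With a resolver the siblings are merged and replaced by a single value.
        store(riak, "cart", "plum", c -> String.join("+", new TreeSet<String>(c)));
        System.out.println(riak.fetch("cart").siblings);
    }
}
```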

It is a thorny issue, please let me know if I've answered your question 
adequately.

Cheers

Russell

> 
> --
> Andres
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

