On 7/20/10 6:00 PM, Eric Filson wrote:
On Tue, Jul 20, 2010 at 3:02 PM, Justin Sheehy <jus...@basho.com> wrote:
Hi, Eric! Thanks for your thoughts.
On Tue, Jul 20, 2010 at 12:39 PM, Eric Filson <efil...@gmail.com> wrote:
> I would think that this requirement, retrieving all objects in a
> bucket, would be a _very_ commonplace occurrence for modern web
> development and perhaps (depending on requirements) _the_ most common
> function aside from retrieving a single k/v pair.
In my experience, people mostly try to write applications that don't
select everything from a whole bucket/table/whatever, but different
people have different requirements. Certainly, it is sometimes
unavoidable.
Indeed, in my case it is :(
I've had two use cases that bumped into this limitation. In one, we are
just working around / accepting the limitation. In the other, we found
it much easier/safer to consider a different solution entirely.
> I might recommend a hybrid solution (based on my limited knowledge of
> Riak)... What about allowing a bucket property named something like
> "key_index" that points to a key containing a value of "keys in
> bucket". Then, when calling GET /riak/bucket, Riak would use the
> key_index to immediately reduce its result set before applying m/r
> funcs. While I understand this is essentially what a developer would
> do, it would certainly alleviate some code requirements (application
> side) as well as make the behavior of retrieving a bucket's contents
> more "expected" and efficient.
A much earlier incarnation of Riak actually stored bucket keylists
explicitly in a fashion somewhat like what you describe. We removed
this because one of our biggest goals is predictable and understandable
behavior in a distributed-systems sense, and a model like this one
turns each write operation into at least two operations. This isn't
just a performance issue; it also adds complexity. For instance, it is
not immediately obvious what should be returned to the client if the
data item write succeeds but the read/write of the index fails.
Haha, these are the exact reasons I, as a developer, would cite for
wanting a similar method on Riak's side... Without the option of
automatic bucket indexing, this double write is effectively pushed into
the application, where it costs more cycles and more data across the
wire. Instead of doing a single write from the application side and
letting Riak handle the bookkeeping, you have to GET index_key, UPDATE
index_key, ADD new_key... So rather than one transaction with Riak, you
have three transactions with Riak plus the supporting application
logic. Inherently, this adds another level of complexity to the
application code base for something that could be done more efficiently
by the DB engine itself.
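To make the extra round trips concrete, here is a minimal sketch of that
application-side pattern. A plain dict stands in for Riak, and the helper
names (riak_get, riak_put, put_with_index) are invented for illustration;
in a real client each helper call would be a separate HTTP round trip
against Riak's REST interface (/riak/<bucket>/<key>).

```python
import json

# A plain dict standing in for Riak; in practice each helper below is
# one HTTP round trip (GET or PUT on /riak/<bucket>/<key>).
store = {}

def riak_get(bucket, key):
    # one round trip: GET /riak/<bucket>/<key>
    return store.get((bucket, key))

def riak_put(bucket, key, value):
    # one round trip: PUT /riak/<bucket>/<key>
    store[(bucket, key)] = value

def put_with_index(bucket, key, value, index_key="key_index"):
    """Write a value AND maintain a bucket key index client-side.

    Three round trips instead of one:
      1. GET the index
      2. PUT the updated index
      3. PUT the new data item
    A failure between steps 2 and 3 leaves the index inconsistent --
    exactly the partial-failure case discussed in this thread.
    """
    raw = riak_get(bucket, index_key)              # 1. GET index_key
    keys = json.loads(raw) if raw else []
    if key not in keys:
        keys.append(key)
    riak_put(bucket, index_key, json.dumps(keys))  # 2. UPDATE index_key
    riak_put(bucket, key, value)                   # 3. ADD new_key

put_with_index("users", "alice", '{"name": "Alice"}')
put_with_index("users", "bob", '{"name": "Bob"}')
print(json.loads(riak_get("users", "key_index")))  # ['alice', 'bob']
```

The read-modify-write on the index is also a race between concurrent
writers, which is the kind of coordination the engine could do far more
cheaply on the node servicing the request.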
I would think a separate error number and message would suffice as the
return error; obviously, though, developers would need to be made aware
of it so they can code for the exception.
Also, this would be optional: if no index_key were set for the bucket,
this behavior simply wouldn't be used. That would at least make the
system more flexible to application requirements and developer
preferences.
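A rough sketch of how such an opt-in property might behave, with invented
names (list_keys, key_index, ".index") that are not real Riak API: when
the bucket property is set, listing keys is a single read of the index
object; when it is not, the engine falls back to the full-bucket scan it
performs today.

```python
# Hypothetical sketch of the proposed opt-in behavior; names are
# illustrative, not real Riak API. The store is a dict keyed by
# (bucket, key) tuples, standing in for the backend.

def list_keys(bucket_props, store, bucket):
    index_key = bucket_props.get("key_index")
    if index_key is not None:
        # Fast path: one read of the maintained index object.
        return list(store[(bucket, index_key)])
    # Fallback: the existing full scan over every key in the bucket.
    return sorted(k for (b, k) in store if b == bucket)

store_indexed = {
    ("users", "alice"): "...",
    ("users", "bob"): "...",
    ("users", ".index"): ["alice", "bob"],
}
store_plain = {
    ("users", "alice"): "...",
    ("users", "bob"): "...",
}

print(list_keys({"key_index": ".index"}, store_indexed, "users"))  # ['alice', 'bob']
print(list_keys({}, store_plain, "users"))                         # ['alice', 'bob']
```

Buckets that never set the property would pay nothing for the feature,
which is the flexibility being asked for above.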
I understand that there may be people using Riak who either never intend
to have a huge number of keys in the cluster, or who never intend to try
to map reduce over a bucket if they do.
I also understand that there are performance and complexity wins to be
had by eliminating the feature.
That said, I feel it needs to be an optional feature that the engine
itself provides. Pushing it out to the client layer severely
complicates the transaction because it is now two separate REST calls
rather than something that can be done in a tightly coupled fashion on
the node servicing the request.
-Daniel
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com