On 7/20/10 6:00 PM, Eric Filson wrote:
On Tue, Jul 20, 2010 at 3:02 PM, Justin Sheehy <jus...@basho.com> wrote:

    Hi, Eric!  Thanks for your thoughts.

    On Tue, Jul 20, 2010 at 12:39 PM, Eric Filson <efil...@gmail.com> wrote:

    > I would think that this requirement, retrieving all objects in a
    > bucket, would be a _very_ commonplace occurrence in modern web
    > development and perhaps (depending on requirements) _the_ most
    > common function aside from retrieving a single k/v pair.

    I tend to see people that mostly try to write applications that don't
    select everything from a whole bucket/table/whatever as a very
    frequent occurrence, but different people have different requirements.
     Certainly, it is sometimes unavoidable.


Indeed, in my case it is :(
I've had two use cases that bumped into this limitation. In one, we are just working around / accepting the limitation. In the other, we found it much easier/safer to consider a different solution entirely.

    > I might recommend a hybrid solution (based on my limited knowledge
    > of Riak)... What about allowing a bucket property named something
    > like "key_index" that points to a key containing a value of "keys
    > in bucket".  Then, when calling GET /riak/bucket, Riak would use the
    > key_index to immediately reduce its result set before applying m/r
    > funcs.  While I understand this is essentially what a developer
    > would do, it would certainly alleviate some code requirements
    > (application side) as well as make the behavior of retrieving a
    > bucket's contents more "expected" and efficient.

    A much earlier incarnation of Riak actually stored bucket keylists
    explicitly in a fashion somewhat like what you describe.  We removed
    this as one of our biggest goals is predictable and understandable
    behavior in a distributed systems sense, and a model like this one
    turns each write operation into at least two operations.  This isn't
    just a performance issue, but also adds complexity.  For instance, it
    is not immediately obvious what should be returned to the client if a
    data item write succeeds, but the read/write of the index fails?


Haha, these are the exact reasons I would cite as a developer for using a similar method on Riak's side. Without the option of automatic bucket indexing, the double write effectively moves to the application side, where it requires more cycles and more data across the wire. Instead of doing a single write from the application and letting Riak handle the rest, you have to GET index_key, UPDATE index_key, and ADD new_key. So rather than a single transaction with Riak, you have three transactions with Riak plus extra application logic. Inherently, this adds another level of complexity to the application code base for something that could be done more efficiently by the DB engine itself.
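
To make the three-step dance concrete, here is a rough sketch of what the application ends up doing. It runs against a hypothetical in-memory `FakeStore` rather than a real Riak client (none of these names are Riak APIs), and `index_key` is assumed to be an ordinary key whose value is the list of keys in the bucket. The counter simply tallies round trips.

```python
class FakeStore:
    """Hypothetical stand-in for a key/value client; not a Riak API."""

    def __init__(self):
        self.data = {}
        self.round_trips = 0  # one per simulated request to the store

    def get(self, bucket, key, default=None):
        self.round_trips += 1
        return self.data.get((bucket, key), default)

    def put(self, bucket, key, value):
        self.round_trips += 1
        self.data[(bucket, key)] = value


INDEX_KEY = "index_key"  # assumed name of the key holding the key list


def put_with_index(store, bucket, key, value):
    """Write one object and keep the bucket's key list current."""
    keys = store.get(bucket, INDEX_KEY, default=[])  # 1. GET index_key
    if key not in keys:
        keys = keys + [key]
    store.put(bucket, INDEX_KEY, keys)               # 2. UPDATE index_key
    store.put(bucket, key, value)                    # 3. ADD new_key


def list_bucket(store, bucket):
    """Return every object in the bucket by walking the key list."""
    keys = store.get(bucket, INDEX_KEY, default=[])
    return {k: store.get(bucket, k) for k in keys}


store = FakeStore()
put_with_index(store, "users", "alice", {"age": 30})
put_with_index(store, "users", "bob", {"age": 25})
trips_for_two_writes = store.round_trips
everything = list_bucket(store, "users")
```

Even in this toy version, two clients running `put_with_index` at the same time could both read the same key list and one update would overwrite the other, which is part of why doing this in the application feels fragile.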

I would think a separate error number and message would suffice as a return error. Obviously, though, this would require making developers aware of it so they can code for the exception.

Also, this would be optional: if the key_index property wasn't set for the bucket, then this setup wouldn't be used. This would at least make the system more flexible to the application requirements and developer preferences.
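
As a rough illustration of the "separate error" and "optional" ideas above, the sketch below shows an engine-side write that reports a distinct status when the object is stored but the index update fails, and that skips the index entirely when the bucket has not opted in. Everything here (`WriteStatus`, `indexed_put`, `key_index_enabled`, the toy index classes) is hypothetical and only meant to show the shape of the return value, not how Riak works or would work.

```python
from enum import Enum


class WriteStatus(Enum):
    OK = 0            # object written (and index updated, if enabled)
    INDEX_FAILED = 1  # object written, but the index update failed


class IndexUnavailable(Exception):
    """Raised by the toy index when it cannot be updated."""


class BrokenIndex:
    """Toy index that always fails, to exercise the partial-success path."""

    def add(self, key):
        raise IndexUnavailable(key)


def indexed_put(objects, index, key, value, key_index_enabled=True):
    """Store value; update the key index only if the bucket opted in."""
    objects[key] = value
    if not key_index_enabled:
        return WriteStatus.OK
    try:
        index.add(key)
    except IndexUnavailable:
        # The data item is stored; tell the caller the index is stale.
        return WriteStatus.INDEX_FAILED
    return WriteStatus.OK


objects = {}
ok = indexed_put(objects, set(), "a", 1)
partial = indexed_put(objects, BrokenIndex(), "b", 2)
skipped = indexed_put(objects, BrokenIndex(), "c", 3, key_index_enabled=False)
```

The caller then has to decide what to do with `INDEX_FAILED` (retry, repair the index later, or surface it), which is exactly the extra case developers would need to code for.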

I understand that there may be people using Riak who either never intend to have a huge number of keys in the cluster, or who never intend to try to map reduce over a bucket if they do. I also understand that there are performance and complexity wins to be had by eliminating the feature.

That said, I feel it needs to be an optional feature that the engine itself provides. Pushing it out to the client layer severely complicates the transaction because it is now two separate REST calls rather than something that can be done in a tightly coupled fashion on the node servicing the request.

-Daniel