Agreed. In fact, jrecursive pointed out to me last week that vnode operations are synchronous. That means that when you call list-keys, not only is it going to take a long time (right now upwards of 5 minutes) to complete, but while each vnode is returning its list of keys *it blocks any other requests*.

While list-keys is an unfortunate necessity for some things, its use should be minimized if you're going to get to any appreciable (100M keys) scale. I don't even know how we're going to use it at all above a billion. Possibly by listing the keys periodically from bitcask directly, and maintaining an index ourselves.
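
As a rough illustration of the "maintaining an index ourselves" idea, here is a minimal sketch. It swaps in a simpler technique than the periodic bitcask scan mentioned above: the index is updated at write and delete time. The IndexedStore class and its in-memory Hash are hypothetical stand-ins for a real bucket, not part of riak-client.

```ruby
require 'set'

# Hypothetical wrapper: the Hash stands in for a real Riak bucket.
class IndexedStore
  def initialize
    @objects = {}     # stand-in for the stored objects
    @index = Set.new  # application-maintained set of known keys
  end

  # Store the value and record the key in the index.
  def put(key, value)
    @objects[key] = value
    @index.add(key)
  end

  # Remove the value and drop the key from the index.
  def delete(key)
    @objects.delete(key)
    @index.delete(key)
  end

  # Answer "which keys exist?" from the index, with no list-keys call.
  def keys
    @index.to_a.sort
  end
end

store = IndexedStore.new
store.put("a.xml", "<a/>")
store.put("b.xml", "<b/>")
store.delete("a.xml")
remaining = store.keys
```

In a real cluster the index itself would have to live somewhere durable and tolerate concurrent writers, which this sketch does not attempt.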

--Kyle

On 05/26/2011 09:40 AM, Sean Cribbs wrote:
With recent commits
(https://github.com/seancribbs/ripple/compare/35d7323fb0e179c8c971...da3ab71a19d194c65a7b),
it is cached until you either refresh it manually by passing :reload
=> true or a block (for streaming key lists). This was the compromise
reached in that pull-request.
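
The memoization described above can be sketched roughly as follows. This is not the riak-client source: SketchBucket, its in-memory backend array, and the list_calls counter are hypothetical, and the choice to have the block form bypass the cache without updating it is an assumption.

```ruby
# Hypothetical stand-in for a bucket; NOT the riak-client implementation.
class SketchBucket
  attr_reader :list_calls

  def initialize(backend_keys)
    @backend_keys = backend_keys # simulates the keys held by the cluster
    @keys = nil                  # memoized key list
    @list_calls = 0              # counts simulated list-keys round trips
  end

  # Simulates a server-side delete; the memoized list is left untouched.
  def delete(key)
    @backend_keys.delete(key)
  end

  def keys(options = {})
    if block_given?
      # Streaming form: always goes to the backend (assumed behavior).
      @list_calls += 1
      @backend_keys.each { |k| yield k }
      return nil
    end
    if @keys.nil? || options[:reload]
      # First call, or an explicit :reload => true: fetch a fresh list.
      @list_calls += 1
      @keys = @backend_keys.dup
    end
    @keys
  end
end

bucket = SketchBucket.new(["4000-17.xml", "4000-18.xml"])
first = bucket.keys.first
bucket.delete(first)
stale = bucket.keys.include?(first)                   # memoized list still has the key
fresh = bucket.keys(:reload => true).include?(first)  # refreshed list does not
```

This mirrors the symptom in the original report: the deleted key stays in the memoized list until a reload is requested.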

All of this caching discussion glosses over the fact that you *should
not list keys* in any real application. It really begs the question --
how often do you list keys in Redis, or memcached? I suspect that
generally you don't. This isn't a relational database. (Also, how often
do you actually do a full-table scan in MySQL? You don't if you're sane
-- you use an index, or even LIMIT + OFFSET.)

I'm tempted to remove Document::all and make Bucket#keys harder to
access, but the balance between discouraging bad behavior and exposing
available functionality is a hard one to strike. I don't want new
developers to immediately use list-keys and then be discouraged from
using Riak because it's slow; on the other hand, it /can be useful/ in
some circumstances. In those cases where it's useful, the developer
should probably be responsible enough to request the key list only once;
the caching behavior simply does this for them. I guess whether it
/should/ do this for them is the issue at hand.

All that said, I'm really torn on this issue, and the same problem
applies to full-bucket MapReduce. Caveat emptor.

Sean Cribbs <s...@basho.com>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On May 26, 2011, at 10:35 AM, Jonathan Langevin wrote:

How long is the key list cached like that, naturally?

Jonathan Langevin
Systems Administrator
Loom Inc.
Wilmington, NC: (910) 241-0433 - jlange...@loomlearning.com -
www.loomlearning.com - Skype: intel352

On Thu, May 26, 2011 at 10:35 AM, Sean Cribbs <s...@basho.com> wrote:

    Keith,

    There was a pull-request issue out for this on the Github project
    (https://github.com/seancribbs/ripple/pull/168). For various
    reasons, the list of keys is memoized in the Riak::Bucket
    instance. Passing :reload => true to the #keys method will cause
    it to refresh. I like to discourage list-keys, but with the
    memoized list you don't shoot yourself in the foot as often.

    Sean Cribbs <s...@basho.com>
    Developer Advocate
    Basho Technologies, Inc.
    http://basho.com/

    On May 26, 2011, at 10:29 AM, Keith Bennett wrote:

    > All -
    >
    > I just started working with Riak, and am using the riak-client
    > Ruby gem.
    >
    > When I delete a key from a bucket, and try to fetch the value
    > associated with that key, I get a 404 error (which is reasonable).
    > However, it remains in the bucket's list of keys (i.e. the value
    > returned by bucket.keys()). Why is the key still reported to exist
    > in the bucket? Is bucket.keys cached, and therefore unaware of the
    > deletion? Here's a riak-client Ruby script and its output in irb
    > that illustrates this:
    >
    > ree-1.8.7-2010.02 :001 > require 'riak'
    > => true
    > ree-1.8.7-2010.02 :002 >
    > ree-1.8.7-2010.02 :003 > client = Riak::Client.new
    > => #<Riak::Client http://127.0.0.1:8098>
    > ree-1.8.7-2010.02 :004 > bucket = client['links']
    > => #<Riak::Bucket {links}>
    > ree-1.8.7-2010.02 :005 > key = bucket.keys.first
    > => "4000-17.xml"
    > ree-1.8.7-2010.02 :006 > object = bucket[key]
    > => #<Riak::RObject {links,4000-17.xml} [text/xml]:(6430 bytes)>
    > ree-1.8.7-2010.02 :007 > object.delete
    > => #<Riak::RObject {links,4000-17.xml} [text/xml]:(6430 bytes)>
    > ree-1.8.7-2010.02 :008 > bucket.keys.first
    > => "4000-17.xml"
    > ree-1.8.7-2010.02 :009 > object = bucket[key]
    > Riak::HTTPFailedRequest: Expected [200, 300] from Riak but received 404. not found
    >
    > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:55:in `perform'
    > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1054:in `request'
    > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:2142:in `reading_body'
    > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1053:in `request'
    > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1037:in `request'
    > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:543:in `start'
    > from /Users/kbennett/.rvm/rubies/ree-1.8.7-2010.02/lib/ruby/1.8/net/http.rb:1035:in `request'
    > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:47:in `perform'
    > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:46:in `tap'
    > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/net_http_backend.rb:46:in `perform'
    > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/http_backend/transport_methods.rb:59:in `get'
    > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/client/http_backend.rb:72:in `fetch_object'
    > from /Users/kbennett/.rvm/gems/ree-1.8.7-2010.02/gems/riak-client-0.9.4/lib/riak/bucket.rb:101:in `[]'
    > from riak-delete-failure.rb:9
    >
    > Thanks,
    > Keith
    >
    >
    >
    > _______________________________________________
    > riak-users mailing list
    > riak-users@lists.basho.com
    > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

