I think you can avoid listing all keys in a bucket by maintaining a
separate object that contains a list of the current keys. I usually
append the keys to a "/bucket/_collection" object.
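
Roughly what I mean, as a minimal sketch only. It uses a plain Python dict
as a stand-in for the bucket, so nothing here is the real Riak client API;
the class name KeyIndexedBucket and its methods are made up for
illustration. It also ignores concurrent writers, which could overwrite
each other's updates to the index object.

```python
import json


class KeyIndexedBucket:
    """Toy in-memory bucket that records its keys in a "_collection" object.

    Hypothetical stand-in for a real bucket: self._store plays the role of
    the key/value store, and every value is kept as a JSON string.
    """

    INDEX_KEY = "_collection"

    def __init__(self):
        self._store = {}

    def _read_index(self):
        # The index is just another object in the same bucket.
        raw = self._store.get(self.INDEX_KEY)
        return json.loads(raw) if raw is not None else []

    def put(self, key, value):
        # Store the value, then append the key to the index if it is new.
        self._store[key] = json.dumps(value)
        keys = self._read_index()
        if key not in keys:
            keys.append(key)
            self._store[self.INDEX_KEY] = json.dumps(keys)

    def delete(self, key):
        # Remove the value and drop the key from the index.
        self._store.pop(key, None)
        keys = [k for k in self._read_index() if k != key]
        self._store[self.INDEX_KEY] = json.dumps(keys)

    def keys(self):
        # One read of the index object instead of a full key listing.
        return self._read_index()
```

Reading the current keys is then a single fetch of the "_collection"
object, and those keys can be passed to the m/r job as explicit inputs
instead of the whole bucket.
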
On Thu, Jan 13, 2011 at 9:27 AM, Sean Cribbs <s...@basho.com> wrote:
>>
>> Unfortunately, even if additional nodes yield linear performance
>> gains, the m/r overhead seems very large -- if I'm getting 1.5 seconds
>> to process 1,000 items on one node, it seems apparent that I should
>> get roughly 1.5 seconds to process 3,000 items on 3 nodes, which
>> still is awfully slow.
>>
>> Do you know how Riak compares to HBase, MongoDB or Cassandra for large
>> dataset processing and analysis with m/r, when talking hundreds of
>> millions, or even billions of keys? It would seem that key traversal
>> performance would prevent Riak from competing in that space. Maybe
>> you could do something with Riak Search, but I'm not sure if it would
>> be comparable.
>
> To be fair, you can't do a microbenchmark and then try to extrapolate it to 
> large datasets; things change at scale. Also, key-listing has been a known 
> limitation of Riak for a long time, and one we have been quite vocal about. 
> There have been improvements recently, but it's still an O(N) computation 
> where N is the total number of keys stored in the cluster. Therefore, it's 
> important to structure your data such that you limit the use of key lists. 
> Compare performance after you have done that, and run your benchmark on 
> something other than a single node (4 or more in a cluster is best), with a 
> dataset that approximates the target size.
>
> Sean Cribbs <s...@basho.com>
> Developer Advocate
> Basho Technologies, Inc.
> http://basho.com/
>
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
