Hi Thomas,

This is a topic that has come up many times. Lemme just hit a couple of high 
notes in no particular order:

- If you must do a list keys op on a bucket, you must must must use 
"?keys=stream". "?keys=true" blocks on the coordinating node until every node 
has returned its keys. "?keys=stream" starts sending keys as soon as the first 
node returns.

- "list keys" is one of the most expensive native operations you can perform in 
Riak. It does a full scan not just of the keys in your bucket but of every key 
in your cluster. It is obnoxiously expensive and only gets more so as the 
number of keys in your cluster grows. There have been discussions about 
changing this, but everything comes with a cost (more open file descriptors) 
and I do not believe a decision has been made yet.

- Riak is in no way a relational system. It is, in fact, about as opposite as 
you can get. Incidentally, "select *" is generally not recommended in the 
Kingdom of Relations and is regarded as wasteful. You need a bit of a mind 
shift away from the relational world to have success with NoSQL in general and 
Riak in particular.

- There are no native indices in Riak. By default Riak uses the Bitcask 
backend. Bitcask has many advantages, but one disadvantage is that all keys 
(key length + a bit of overhead) must fit in RAM.

- Do not use "?keys=true". Your computer will melt. And then your face.

- As of Riak 0.14 your m/r (MapReduce) jobs can filter on key name. I would 
highly recommend that your data architecture take this into account by using 
keys that have meaningful names. This will allow you to not scan every key in 
your cluster.

- Buckets are analogous to relational tables but only just. In Riak, you can 
think of a bucket as a namespace holder (it is used as part of the default 
circular hash function) but primarily as a mechanism to differentiate system 
settings from one group of keys to the next.

- There is no penalty for having as many buckets as you like, except when 
their settings deviate from the system defaults. By settings I mean things 
like hooks, replication values and backends, among others.

- One should list keys by truth if one enjoys sitting in parking lots on the 
freeway on a scorching summer's day or perhaps waiting in a TSA line at one's 
nearest international point of embarkation surrounded by octomom families, all 
the while juggling between the grope or the pr0n slideshow. If that is for you, 
use "?keys=true".

- Virtually everything in Riak is transient. Meaning, for the most part (not 
including the 60 seconds or so of m/r cache), there is no caching going on in 
Riak outside of the operating system. I.e., your subsequent queries will do 
more or less the same work as their predecessors. You need to cache your own 
results if you want to reuse them... quickly.
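
To make the "?keys=stream" point concrete, here is a minimal Python sketch of a 
client that handles keys as they arrive instead of waiting for the full list. 
It assumes the streamed body is a run of back-to-back JSON objects shaped like 
{"keys": [...]}, and that the bucket lives at the same /riak/<bucket> path used 
in the quoted message. The function names are made up for illustration.

```python
import codecs
import json
import urllib.request
from typing import Iterable, Iterator


def iter_streamed_keys(chunks: Iterable[str]) -> Iterator[str]:
    """Yield keys from text chunks that hold back-to-back JSON objects.

    Assumption: each object looks like {"keys": ["k1", "k2"]}. An object may
    be split across chunk boundaries, so unparsed text is buffered.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # incomplete object; wait for the next chunk
            buffer = buffer[end:]
            # Objects without a "keys" field are skipped.
            for key in obj.get("keys", []):
                yield key


def stream_bucket_keys(base_url: str, bucket: str) -> Iterator[str]:
    """Hypothetical helper: stream the keys of one bucket over HTTP."""
    url = f"{base_url}/riak/{bucket}?keys=stream"
    utf8 = codecs.getincrementaldecoder("utf-8")()
    with urllib.request.urlopen(url) as resp:

        def text_chunks() -> Iterator[str]:
            while True:
                data = resp.read1(4096)  # returns as soon as some bytes exist
                if not data:
                    return
                yield utf8.decode(data)

        yield from iter_streamed_keys(text_chunks())
```

Something like `for key in stream_bucket_keys("http://localhost:8098", "mytable")` 
would then let you act on each key as it shows up. It does not make the 
operation any cheaper for the cluster; it only stops the client from waiting on 
the slowest node before it sees anything.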

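The Bitcask point above comes down to simple arithmetic, so here is a rough 
Python sketch of it. The per-key overhead is a parameter because the figure is 
only "a bit of overhead" here; the 40 bytes used in the example is a 
placeholder, not a measured Bitcask number. The sketch also ignores replication 
and how keys are spread across nodes.

```python
def estimate_key_memory_bytes(num_keys: int,
                              avg_key_len: int,
                              per_key_overhead: int) -> int:
    """Rough RAM needed to hold every key: keys * (key length + overhead)."""
    return num_keys * (avg_key_len + per_key_overhead)


# Hypothetical workload: 10 million keys averaging 30 bytes each,
# with a placeholder overhead of 40 bytes per key.
estimate = estimate_key_memory_bytes(10_000_000, 30, 40)
print(f"{estimate:,} bytes, about {estimate / 2**20:.0f} MiB")
```

The takeaway is that key count and key length both feed straight into memory, 
so long, chatty key names are not free.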

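For the key-name filtering mentioned above, here is a sketch of what an m/r job 
with a key filter might look like when built from Python. It assumes the JSON 
job format with a "key_filters" list under "inputs", the "starts_with" 
predicate, the built-in Riak.mapValuesJson map function, and a /mapred endpoint 
on the same port as the quoted URL. The bucket name "invoices" and the 
date-prefixed key scheme are invented for the example.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_key_filter_job(bucket: str,
                         key_filters: List[List[Any]]) -> Dict[str, Any]:
    """Build an m/r job whose inputs are narrowed by key name."""
    return {
        "inputs": {"bucket": bucket, "key_filters": key_filters},
        "query": [
            # Assumed built-in: returns each matching object parsed as JSON.
            {"map": {"language": "javascript", "name": "Riak.mapValuesJson"}},
        ],
    }


def build_mapred_request(base_url: str,
                         job: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the job in a POST request; nothing is sent until it is opened."""
    return urllib.request.Request(
        f"{base_url}/mapred",
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical key scheme: keys such as "2011-01-22-0001", so a prefix
# selects one month of invoices.
job = build_key_filter_job("invoices", [["starts_with", "2011-01-"]])
request = build_mapred_request("http://localhost:8098", job)
```

Passing `request` to `urllib.request.urlopen` would submit the job. This only 
pays off if the key names carry meaning in the first place, which is the reason 
to design them up front.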

Oh, there's more but I'm pretty jelloed from last night. Welcome to the fold, 
Thomas. Can I call you Tom?

Cheers,
-Alexander Sicular

@siculars

On Jan 22, 2011, at 10:19 AM, Thomas Burdick wrote:

> I've been playing around with riak lately as really my first usage of a 
> distributed key/value store. I quite like many of the concepts and 
> possibilities of Riak and what it may deliver, however I'm really stuck on an 
> issue.
> 
> Doing the equivalent of a select * from sometable in riak is seemingly slow. 
> As a quick test I tried...
> 
> http://localhost:8098/riak/mytable?keys=true
> 
> Before even iterating over the keys this was unbearably slow already. This 
> took almost half a second on my machine where mytable is completely empty! 
> 
> I'm a little baffled. I would assume that getting all the keys of a table is 
> an incredibly common task? How do I get all the keys of a table quickly? By 
> quickly I mean a few milliseconds or less, as I would expect of even a "slow" 
> RDBMS with an empty table. Even with thousands of rows, a SQL table can 
> return all of its primary keys in a few milliseconds.
> 
> Tom Burdick
> 
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

