Hi Thomas,

This is a topic that has come up many times. Lemme just hit a couple of high notes in no particular order:

- If you must do a list keys op on a bucket, you must must must use "?keys=stream". "?keys=true" will block on the coordinating node until all nodes return their keys. "?keys=stream" will start sending keys as soon as the first node returns.

- "list keys" is one of the most expensive native operations you can perform in Riak. It does a full scan not just of the keys in your bucket but of all the keys in your cluster. It is obnoxiously expensive and only more so as the number of keys in your cluster grows. There have been discussions about changing this but everything comes with a cost (more open file descriptors) and I do not believe a decision has been made yet.

- Riak is in no way a relational system. It is, in fact, about as opposite as you can get. Incidentally, "select *" is generally not recommended in the Kingdom of Relations and regarded as wasteful. You need a bit of a mind shift from the relational world to have success with NoSQL in general and Riak in particular.

- There are no native indices in Riak. By default Riak uses the bitcask backend. Bitcask has many advantages but one disadvantage is that all keys (key length + a bit of overhead) must fit in RAM.

- Do not use "?keys=true". Your computer will melt. And then your face.

- As of Riak 0.14 your m/r can filter on key name. I would highly recommend that your data architecture take this into account by using keys that have meaningful names. This will allow you to not scan every key in your cluster.

- Buckets are analogous to relational tables but only just. In Riak, you can think of a bucket as a namespace holder (it is used as part of the default circular hash function) but primarily as a mechanism to differentiate system settings from one group of keys to the next.

- There is no penalty for unlimited buckets except for when their settings deviate from the system defaults. By settings I mean things like hooks, replication values and backends among others.

- One should list keys by truth if one enjoys sitting in parking lots on the freeway on a scorching summer's day or perhaps waiting in a TSA line at your nearest international point of embarkation surrounded by octomom families all the while juggling between the grope or the pr0n slideshow. If that is for you, use "?keys=true".

- Virtually everything in Riak is transient. Meaning, for the most part (not including the 60 seconds or so of m/r cache), there is no caching going on in Riak outside of the operating system. I.e., your subsequent queries will do more or less the same work as their predecessors. You need to cache your own results if you want to reuse them... quickly.

Oh, there's more but I'm pretty jelloed from last night.

Welcome to the fold, Thomas. Can I call you Tom?

Cheers,
-Alexander Sicular

@siculars

On Jan 22, 2011, at 10:19 AM, Thomas Burdick wrote:

> I've been playing around with riak lately as really my first usage of a
> distributed key/value store. I quite like many of the concepts and
> possibilities of Riak and what it may deliver, however I'm really stuck on an
> issue.
>
> Doing the equivalent of a select * from sometable in riak is seemingly slow.
> As a quick test I tried...
>
> http://localhost:8098/riak/mytable?keys=true
>
> Before even iterating over the keys this was unbearably slow already. This
> took almost half a second on my machine where mytable is completely empty!
>
> I'm a little baffled, I would assume that getting all the keys of a table is
> an incredibly common task? How do I get all the keys of a table quickly? By
> quickly I mean a few milliseconds or less as I would expect of even a "slow"
> rdbms with an empty table, even some tables with 1000's of items can get all
> the primary keys of a sql table in a few milliseconds.
>
> Tom Burdick
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
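
P.S. A rough sketch of the two ways around "?keys=true" mentioned above, in case code helps. Treat it as an illustration, not the official client. The function names are mine. The first part assumes the "?keys=stream" body arrives as a run of back-to-back JSON objects, each optionally carrying a "keys" list, and that your HTTP client may hand you a chunk that stops halfway through an object. The second part builds an m/r job that uses a key filter; the "invoices" bucket and the "2011-01" prefix are made up, and I am assuming the built-in JavaScript map function "Riak.mapValuesJson" is available on your nodes.

```python
import json
from urllib.parse import quote


def stream_keys_url(bucket, host="localhost", port=8098):
    """Build the streaming list-keys URL (host and port as in Thomas's example)."""
    return "http://%s:%d/riak/%s?keys=stream" % (host, port, quote(bucket, safe=""))


def iter_streamed_keys(chunks):
    """Yield keys one at a time from an iterable of text chunks.

    Assumption: the body is a sequence of concatenated JSON objects and a
    chunk boundary can fall anywhere, including inside an object.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                obj, end = decoder.raw_decode(buffer)
            except ValueError:
                break  # object is incomplete, wait for the next chunk
            buffer = buffer[end:]
            # Objects without a "keys" entry (e.g. bucket props) are skipped.
            for key in obj.get("keys", []):
                yield key


def key_filter_job(bucket, prefix):
    """Build an m/r job body that only visits keys starting with `prefix`."""
    return {
        "inputs": {"bucket": bucket, "key_filters": [["starts_with", prefix]]},
        "query": [{"map": {"language": "javascript", "name": "Riak.mapValuesJson"}}],
    }
```

Feed iter_streamed_keys whatever decoded text chunks your HTTP client gives you for stream_keys_url("mytable"), and you can start working on keys before the last node has answered. The dict from key_filter_job("invoices", "2011-01") would be serialized with json.dumps and POSTed to the m/r endpoint, which only pays off if your keys have meaningful names to begin with.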