Hi Chuck,

There is currently no support for listing keys by just issuing a GET to /buckets/bucketname/. Part of the reason is that there are many operations to perform on the bucket resource: list keys, get bucket properties, etc. That's why there are several URLs to specify what you want to do with the bucket: /buckets/bucketname/keys, /buckets/bucketname/props, etc. (very much in keeping with the REST philosophy).
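As a rough illustration of the sub-resource layout described above, here is a minimal Python sketch that builds the URL for each bucket operation. The base URL (localhost on port 8098, as in the original post quoted further down) and the helper name bucket_url are assumptions for illustration, not part of any Riak client library.

```python
from urllib.parse import quote

# Assumed address of a local Riak node; adjust for your cluster.
BASE_URL = "http://localhost:8098"


def bucket_url(bucket, resource):
    """Build the URL of a bucket sub-resource such as 'keys' or 'props'."""
    # Percent-encode the bucket name so spaces or slashes cannot break the path.
    return f"{BASE_URL}/buckets/{quote(bucket, safe='')}/{resource}"


print(bucket_url("bucketname", "keys"))   # the list-keys resource
print(bucket_url("bucketname", "props"))  # the bucket-properties resource
```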
Your best bet for listing keys is actually the "streaming list keys" method: /buckets/bucketname/keys?keys=stream

What you definitely DON'T want to do in production is use the regular list keys call, /buckets/bucketname/keys?keys=true, which builds up a giant JSON array of the keys in memory on the coordinating node and only then sends the whole thing back in the reply. As a result, it usually takes far too long and, if you have a large enough number of keys, often ends up eating all of the memory on a node and erroring out.

The only time the regular, non-streaming list keys operation is useful is when you're developing on a toy dataset and want to quickly list the keys in a bucket via curl or in a browser (streaming list keys doesn't work so well in a browser).

So, to recap:

1) Listing keys with non-trivial datasets (for use with logical backups, etc.): use streaming list keys (keys=stream).
2) Listing keys in a browser or via curl, while developing, with toy datasets: use non-streaming list keys (keys=true).

Dmitri

On Thu, Apr 25, 2013 at 12:38 PM, n6mac41717 <c...@stanfordalumni.org> wrote:
> I know it's been over two years since this post, and I'm wondering if the
> latest version of Riak has made improvements to list keys--I tried the query
> with "keys=true" and I didn't seem to have TSA/octomom-related wait times.
>
> I was originally hoping that I could get a list of keys via the RESTful API
> which led me to this thread. In other words, a GET url/bucket/key will
> indeed return what I shoved into the bucket at that key, but I was hoping
> that a GET url/bucket (I guess to be truly RESTful, I should make the bucket
> plural) would return the keys.
>
> Thoughts?
>
> Thanks in advance, Chuck
>
>
> Alexander Sicular wrote
> > Hi Thomas,
> >
> > This is a topic that has come up many times.
> > Lemme just hit a couple of high notes in no particular order:
> >
> > - If you must do a list keys op on a bucket, you must must must use
> > "?keys=stream". True will block on the coordinating node until all nodes
> > return their keys. Stream will start sending keys as soon as the first
> > node returns.
> >
> > - "list keys" is one of the most expensive native operations you can
> > perform in Riak. Not only does it do a full key scan of all the keys in
> > your bucket, but all the keys in your cluster. It is obnoxiously expensive
> > and only more so as the number of keys in your cluster grows. There has
> > been discussions about changing this but everything comes with a cost
> > (more open file descriptors) and I do not believe a decision has been made
> > yet.
> >
> > -Riak is in no way a relational system. It is, in fact, about as opposite
> > as you can get. Incidentally, "select *" is generally not recommended in
> > the Kingdom of Relations and regarded as wasteful. You need a bit of a
> > mind shift from relational world to have success with nosql in general and
> > Riak in particular.
> >
> > -There are no native indices in Riak. By default Riak uses the bitcask
> > backend. Bitcask has many advantages but one disadvantage is that all keys
> > (key length + a bit of overhead) must fit in ram.
> >
> > -Do not use "?keys=true". Your computer will melt. And then your face.
> >
> > -As of Riak 0.14 your m/r can filter on key name. I would highly recommend
> > that your data architecture take this into account by using keys that have
> > meaningful names. This will allow you to not scan every key in your
> > cluster.
> >
> > -Buckets are analogous to relational tables but only just. In Riak, you
> > can think of a bucket as a namespace holder (it is used as part of the
> > default circular hash function) but primarily as a mechanism to
> > differentiate system settings from one group of keys to the next.
> >
> > -There is no penalty for unlimited buckets except for when their settings
> > deviate from the system defaults. By settings I mean things like hooks,
> > replication values and backends among others.
> >
> > -One should list keys by truth if one enjoys sitting in parking lots on
> > the freeway on a scorching summers day or perhaps waiting in a TSA line at
> > your nearest international point of embarkation surrounded by octomom
> > families all the while juggling between the grope or the pr0n slideshow.
> > If that is for you, use "?keys=true".
> >
> > -Virtually everything in Riak is transient. Meaning, for the most part
> > (not including the 60 seconds or so of m/r cache), there is no caching
> > going on in Riak outside of the operating system. Ie. your subsequent
> > queries will do more or less the same work as their predecessors. You need
> > to cache your own results if you want to reuse them... quickly.
> >
> >
> >
> > Oh, there's more but I'm pretty jelloed from last night. Welcome to the
> > fold, Thomas. Can I call you Tom?
> >
> > Cheers,
> > -Alexander Sicular
> >
> > @siculars
> >
> > On Jan 22, 2011, at 10:19 AM, Thomas Burdick wrote:
> >
> >> I've been playing around with riak lately as really my first usage of a
> >> distributed key/value store. I quite like many of the concepts and
> >> possibilities of Riak and what it may deliver, however I'm really stuck
> >> on an issue.
> >>
> >> Doing the equivalent of a select * from sometable in riak is seemingly
> >> slow. As a quick test I tried...
> >>
> >> http://localhost:8098/riak/mytable?keys=true
> >>
> >> Before even iterating over the keys this was unbearably slow already.
> >> This took almost half a second on my machine where mytable is completely
> >> empty!
> >>
> >> I'm a little baffled, I would assume that getting all the keys of a table
> >> is an incredibly common task? How do I get all the keys of a table
> >> quickly?
> >> By quickly I mean a few milliseconds or less as I would expect
> >> of even a "slow" rdbms with an empty table, even some tables with 1000's
> >> of items can get all the primary keys of a sql table in a few
> >> milliseconds.
> >>
> >> Tom Burdick
> >>
> >> _______________________________________________
> >> riak-users mailing list
> >> riak-users@.basho
> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@.basho
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>
> --
> View this message in context:
> http://riak-users.197444.n3.nabble.com/Getting-all-the-Keys-tp2308764p4027757.html
> Sent from the Riak Users mailing list archive at Nabble.com.
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
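
For anyone landing on this thread later, here is a minimal Python sketch of consuming the streaming list keys call (keys=stream) recommended in this thread. It assumes the response body is a sequence of JSON objects sent back to back, each carrying a "keys" array with part of the key set, and that the node listens on localhost:8098; check both against your Riak version. The names iter_json_objects, read_text_chunks and stream_keys are made up for this example.

```python
import codecs
import json
from urllib.parse import quote
from urllib.request import urlopen


def iter_json_objects(chunks):
    """Yield JSON values from text chunks, even when a value spans chunks."""
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # incomplete object: wait for the next chunk
            yield obj
            buffer = buffer[end:]


def read_text_chunks(response, size=8192):
    """Read the HTTP body piece by piece, decoding UTF-8 incrementally."""
    utf8 = codecs.getincrementaldecoder("utf-8")()
    while True:
        data = response.read(size)
        if not data:
            break
        yield utf8.decode(data)


def stream_keys(bucket, base_url="http://localhost:8098"):
    """Yield keys one at a time instead of building the full list in memory."""
    # URL layout taken from this thread; base_url is an assumption.
    url = f"{base_url}/buckets/{quote(bucket, safe='')}/keys?keys=stream"
    with urlopen(url) as response:
        for obj in iter_json_objects(read_text_chunks(response)):
            # Assumed shape: {"keys": [...]}; some objects may have no keys.
            yield from obj.get("keys", [])
```

Iterating over stream_keys("bucketname") then hands back keys as they arrive, so the client never has to hold the whole key set at once.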