Regarding #2, I think bitcask could be modified fairly easily to support an efficient list-keys-by-bucket operation, without sacrificing free buckets:

The current bitcask stores record locators (key, file_id, file_offset) in memory in a big hash table by key (the bitcask key, in Riak's case, is the Riak {bucket,key} as a binary). What if the hash table were replaced with an in-memory btree? A good implementation shouldn't take more memory than a hash table, and get/put should still be very fast.

The plus side is that one could then do a range traversal of the btree to get all keys in a given bucket (assuming the right comparison function for the btree). There wouldn't be any additional overhead of extra file handles, etc. because everything for a vnode would still be stored in one bitcask instance.

What do you think?

Curtis

On Wed, Jul 21, 2010 at 6:31 AM, Justin Sheehy <jus...@basho.com> wrote:
> I think that we are all (myself included) getting two different issues
> a bit mixed up in this discussion:
>
> 1: storing an implicit index of keys in the Riak key/value store
>
> 2: making buckets separate in that a per-bucket operation's
> performance would not be affected by the content of other buckets
>
> The thread started out with a request for #2, but included a
> suggestion to do #1. These are actually two different topics.
>
> The first issue, implicitly storing a big index of keys, is
> impractical in a distributed key/value storage system that has Riak's
> availability goals. We are very unlikely to implement this as
> described in the near future. However, we very much recognize that
> there are many different ways that people would like to find their
> data. In that light, we are working on multiple different efforts
> that will use the Riak core to provide data storage with more than
> just "simple" key/value access.
>
> The second issue, of isolating buckets, is a much simpler design
> choice and is also a per-backend implementation detail. We can create
> and provide an alternative bitcask adapter that does this. It will be
> a real tradeoff: in exchange for buckets not impacting each other as
> much, the system will consume more filehandles, be a bit less
> efficient at rebalancing, and will generally make buckets no longer
> "free". This is a reasonable tradeoff in either direction for various
> applications, and I support making it available as a choice. I have
> created a bugzilla entry to track it:
> https://issues.basho.com/show_bug.cgi?id=480
>
> I hope that this helps to clarify the issue.
>
> -Justin
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
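
To make the btree suggestion at the top of this message a bit more concrete, here is a minimal sketch in Python. It is not bitcask code: the class name OrderedKeydir and its methods are hypothetical, and a sorted list searched with bisect stands in for a real btree (list inserts are O(n), where a btree would give O(log n)). The only point is to show that ordering the index by (bucket, key) turns "list keys in a bucket" into a range scan.

```python
import bisect


class OrderedKeydir:
    """Toy ordered index: (bucket, key) -> (file_id, file_offset).

    Hypothetical illustration only. A sorted list stands in for a btree.
    """

    def __init__(self):
        self._keys = []  # sorted list of (bucket, key) tuples
        self._locs = []  # record locators, parallel to self._keys

    def _find(self, bkey):
        # Binary search for bkey; report its slot and whether it is present.
        i = bisect.bisect_left(self._keys, bkey)
        found = i < len(self._keys) and self._keys[i] == bkey
        return i, found

    def put(self, bucket, key, file_id, file_offset):
        i, found = self._find((bucket, key))
        if found:
            self._locs[i] = (file_id, file_offset)  # overwrite locator
        else:
            self._keys.insert(i, (bucket, key))
            self._locs.insert(i, (file_id, file_offset))

    def get(self, bucket, key):
        i, found = self._find((bucket, key))
        return self._locs[i] if found else None

    def list_keys(self, bucket):
        # Tuples compare bucket first, so all keys of one bucket are adjacent.
        # (bucket, b"") sorts before every real key in that bucket.
        i = bisect.bisect_left(self._keys, (bucket, b""))
        while i < len(self._keys) and self._keys[i][0] == bucket:
            yield self._keys[i][1]
            i += 1


keydir = OrderedKeydir()
keydir.put(b"users", b"bob", 1, 128)
keydir.put(b"logs", b"2010-07-21", 1, 0)
keydir.put(b"users", b"alice", 2, 64)
print(list(keydir.list_keys(b"users")))
```

The scan touches only the entries of the requested bucket, so its cost does not depend on how many keys the other buckets hold, and everything still lives in one index per vnode.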