On Thu, Mar 24, 2011 at 1:51 PM, Nico Meyer <nico.me...@adition.com> wrote:
> The bigger concern for me would be the way the bucket/key tuple is
> serialized:
>
> Eshell V5.8  (abort with ^G)
> 1> iolist_size(term_to_binary({<<>>,<<>>})).
> 13
>
> That's 13 bytes of overhead per key where only 2 bytes are needed with
> reasonable bucket/key length limits of 256 bytes each. Or if that is not
> enough, one could also use a variable length encoding, so bucket/keys
> can be arbitrarily large and the most common cases (less than 128 bytes)
> still only use 2 bytes of overhead.

I've made a branch of bitcask that effectively does this. It uses 3
bytes per record instead of 13, saving 10 bytes (both in RAM and on
disk) per element stored.

The tricky thing, however, is backward compatibility. There are many
Riak installations out there with data stored in bitcask using the old
key encoding, and we shouldn't force them all to do a very costly
full sweep of their existing data in order to get these savings.

Once we sort out the best way to manage a smooth upgrade, I will
happily push out the smaller encoding.

-Justin
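
For illustration, the variable-length scheme described in the quoted message could look roughly like the sketch below. It is written in Python and is not the code from the bitcask branch: the function names (encode_varint, decode_varint, encode_bucket_key, decode_bucket_key) and the choice of a 7-bits-per-byte length prefix are assumptions made for this example. Each of bucket and key is prefixed with its own length, so lengths under 128 bytes cost one byte each, giving 2 bytes of overhead in the common case.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte.

    The high bit of each byte is set when more bytes follow.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte: high bit clear
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_bucket_key(bucket: bytes, key: bytes) -> bytes:
    """Hypothetical compact encoding: each part is length-prefixed."""
    return (encode_varint(len(bucket)) + bucket
            + encode_varint(len(key)) + key)


def decode_bucket_key(buf: bytes) -> tuple[bytes, bytes]:
    """Inverse of encode_bucket_key."""
    blen, pos = decode_varint(buf)
    bucket = buf[pos:pos + blen]
    pos += blen
    klen, pos = decode_varint(buf, pos)
    key = buf[pos:pos + klen]
    return bucket, key
```

With this layout an empty bucket and key encode to 2 bytes, compared with the 13 bytes reported for term_to_binary in the shell session above. A bucket or key of 128 bytes or more costs one extra length byte per additional 7 bits of length.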