The bucket/key pair is passed around in a 2-tuple:
https://github.com/basho/riak_kv/blob/0093af40f8ba97038e98dd04dfea70ef889ff213/src/riak_kv_put_fsm.erl#L84
<https://github.com/basho/riak_kv/blob/0093af40f8ba97038e98dd04dfea70ef889ff213/src/riak_kv_put_fsm.erl#L116>

Each backend can manage the bucket/key pair however it wants. For example, the Bitcask backend uses term_to_binary/1 to convert the bucket/key pair to a single key:
https://github.com/basho/riak_kv/blob/master/src/riak_kv_bitcask_backend.erl#L98

When the backend lists keys in a bucket, it can extract the bucket name from the key term:
https://github.com/basho/riak_kv/blob/master/src/riak_kv_bitcask_backend.erl#L123

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
d...@basho.com

On Mon, Nov 15, 2010 at 2:11 PM, Alexander Sicular <sicul...@gmail.com> wrote:

> So I get that riak is not bucket aware. When you pass a bucket as an
> input in an m/r, as riak sifts through all the keys, how does riak
> isolate bucket specific keys? Are keys stored as /bucket/key internally
> and there is a string comparison on split(key,'/')? Or is there
> something else going on?
>
> Thank you.
>
> On 2010-11-15, Kevin Smith <ksm...@basho.com> wrote:
> > We are giving some thought on how to do that. The main issue wrt
> > bitcask's key listing performance is that bitcask is not bucket aware and
> > lacks the notion of secondary indices. Not being bucket aware means bitcask
> > has to examine all bucket/key pairs to find the ones related to a given
> > bucket. This isn't to say we won't address the problem but merely to point
> > out there's some engineering work required to solve the problem correctly.
> >
> > innostore is moderately bucket-aware right now so I've forked it
> > (http://github.com/kevsmith/innostore) and added bucket-aware key listing.
> > Based on some very basic testing I'm seeing 2.5x speed up in overall key
> > listing performance compared to the official version. I'm hoping the patch,
> > or a modified form of it, will make the next release. If you can handle inno
> > being a bit slower than bitcask and slightly more difficult to set up and
> > tune then this might be an option for you.
> >
> > I've done some basic vetting of the code but I want to emphasize this is a
> > prototype only and hasn't received anything even close to the normal amount
> > of testing we put into a release. Please keep this in mind if you decide to
> > use my forked repo.
> >
> > --Kevin
> >
> > On Nov 15, 2010, at 11:57 AM, Greg Steffensen wrote:
> >
> >> Along these lines, are there any ideas floating around about how to speed
> >> up the listing of keys in a bucket? For the bitcask backend, it seems
> >> like an index of keys-by-bucket ought to be the kind of thing that could
> >> be stored in the hints files to speed this up without affecting
> >> performance for live reads and writes.
> >>
> >> Greg
> >>
> >> On Mon, Nov 15, 2010 at 11:46 AM, Sean Cribbs <s...@basho.com> wrote:
> >> This is possible with Riak's MapReduce but you will likely have increasing
> >> difficulty as your dataset grows, because of the impact of needing to list
> >> keys in a bucket and then eliminate data points you aren't interested in.
> >> In the longer term, there will be improvements to MapReduce such that if
> >> your keys are meaningful, you will be able to filter them more easily
> >> (without examining the data first). You might find Kevin Smith's overview
> >> enlightening: http://www.slideshare.net/hemulen/riak-mapred-preso
> >>
> >> Sean Cribbs <s...@basho.com>
> >> Developer Advocate
> >> Basho Technologies, Inc.
> >> http://basho.com/
> >>
> >> On Nov 15, 2010, at 11:34 AM, Prometheus WillSurvive wrote:
> >>
> >>> Hi,
> >>>
> >>> We have a huge database (around 4 billion records, 30 TB) storing
> >>> video watch information, i.e. view count, comments, favorited, etc. I want
> >>> to produce a daily report of view counts for all videos. That means I need
> >>> to look at two days, today and yesterday, and subtract yesterday's view
> >>> count from today's view count to find the daily impressions. Our Fat DB
> >>> team does this with a few complex queries. I would like to ask whether
> >>> this is possible the Riak map-reduce way. I want to make a demonstration
> >>> to the team to show this.
> >>>
> >>> This is the scenario. We have similar data models for other things. This
> >>> could be a start.
> >>>
> >>> We have a farm of 30 x HP DL380 with 32 GB RAM to test this scenario.
> >>>
> >>> Perhaps a member experienced with Riak map-reduce can share some ideas
> >>> on this.
> >>>
> >>> Regards
> >>>
> >>> Prometheus
> >>> _______________________________________________
> >>> riak-users mailing list
> >>> riak-users@lists.basho.com
> >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
> --
> Sent from my mobile device
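
For illustration only, the key handling described in the reply at the top of this message can be sketched in a few lines of Python. This is not Riak's code: the names encode_key, decode_key and list_keys are hypothetical, the dict stands in for a backend that only knows opaque keys, and a simple length-prefixed encoding stands in for Erlang's term_to_binary/1. The sketch only shows the idea of flattening a (bucket, key) pair into one storage key and recovering the bucket again while scanning.

```python
import struct
from typing import Dict, List, Tuple


def encode_key(bucket: bytes, key: bytes) -> bytes:
    """Flatten a (bucket, key) pair into one opaque storage key.

    Assumption: a 4-byte big-endian length prefix for the bucket is used
    here as a stand-in for term_to_binary/1; the real format differs.
    """
    return struct.pack(">I", len(bucket)) + bucket + key


def decode_key(storage_key: bytes) -> Tuple[bytes, bytes]:
    """Recover the (bucket, key) pair from a storage key made by encode_key."""
    (bucket_len,) = struct.unpack_from(">I", storage_key)
    bucket = storage_key[4:4 + bucket_len]
    key = storage_key[4 + bucket_len:]
    return bucket, key


def list_keys(store: Dict[bytes, bytes], bucket: bytes) -> List[bytes]:
    """List the keys of one bucket by examining every stored key.

    The store has no notion of buckets, so each storage key has to be
    decoded and its bucket compared against the requested one.
    """
    keys = []
    for storage_key in store:
        stored_bucket, key = decode_key(storage_key)
        if stored_bucket == bucket:
            keys.append(key)
    return keys


# Hypothetical data: two buckets sharing one flat key space.
store = {
    encode_key(b"videos", b"v1"): b"...",
    encode_key(b"videos", b"v2"): b"...",
    encode_key(b"users", b"u1"): b"...",
}

print(list_keys(store, b"videos"))
```

Because the store itself knows nothing about buckets, list_keys has to decode and check every stored key, which is the cost Kevin describes above for bitcask's key listing.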
_______________________________________________ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com