Hello,

Since RIAK is primarily a key-value store, I have to manage all indexes on my own. For example, I create a million address datasets in RIAK and then an additional dataset holding a simple index for the addresses (indexed by city name, for example). That works so far, but it is not perfect from my perspective. One concern is that the index itself is not distributed if it is stored under one single key, so it cannot be processed in parallel. That is why I'm looking for a good way to work with distributed indexes. Does anyone know of a document or similar that describes a best practice? I can't be the first person with this requirement :-)
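To make the idea concrete, here is a minimal Python sketch of what I mean by a distributed index. It uses a plain dict as a stand-in for RIAK and makes no real RIAK client calls. The key naming scheme (`idx/city/<city>/<shard>`), the shard count, and all function names are assumptions of mine for illustration only.

```python
import hashlib

NUM_SHARDS = 4  # assumed number of index shards per city (illustrative)

store = {}  # in-memory stand-in for the key-value store: key -> value


def shard_for(address_key: str) -> int:
    """Pick a stable shard number for an address key."""
    digest = hashlib.md5(address_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def index_key(city: str, shard: int) -> str:
    """Hypothetical naming scheme for one shard of the city index."""
    return f"idx/city/{city.lower()}/{shard}"


def put_address(address_key: str, address: dict) -> None:
    """Store the address and record its key in exactly one index shard."""
    store[address_key] = address
    ikey = index_key(address["city"], shard_for(address_key))
    store.setdefault(ikey, []).append(address_key)


def lookup_city(city: str) -> list:
    """Collect the addresses of a city by reading every index shard.

    Each shard is a separate key, so the reads are independent of each
    other and could be fetched or processed in parallel.
    """
    address_keys = []
    for shard in range(NUM_SHARDS):
        address_keys.extend(store.get(index_key(city, shard), []))
    return [store[k] for k in address_keys]
```

With this layout the index for one city is spread over several keys instead of one. The sketch ignores that appending to a shard is a read-modify-write, so concurrent writers in a real cluster would need some form of conflict handling.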
Ideally, I would then like to start a MapReduce job with the keys holding the index as input; the following map phases would work with the keys/documents that came out of the first map phase. Is it possible within a map function to look up a dataset by its key and pass that object to the next map phase? That would make things a bit easier, because I could then handle everything within RIAK.

Thanks,
Felix

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com