Hello,

Since Riak is primarily a key-value store, I have to manage all
indexes on my own.
For example, I store a million address records in Riak and then create
one additional object holding a simple index over those addresses
(indexed by city name, for example).
That works so far, but is not perfect from my perspective.
One concern is that the index itself is not distributed if it is
stored under a single key, so it cannot be processed in parallel.
That is why I'm looking for a good way to work with distributed
indexes. Does anyone know of a document or similar resource that
describes best practices?
I assume I'm not the first person with this requirement :-)
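
For illustration, one way to spread such an index over several keys is
to hash the city name into a fixed number of index shards. The sketch
below is plain Python and does not talk to Riak at all; the shard count,
the "city_index_N" key naming, and the helper names are hypothetical
choices made only for this example, not an established best practice.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 8  # hypothetical shard count, chosen only for illustration


def normalize(city):
    """Normalize a city name so lookups are case- and whitespace-insensitive."""
    return city.strip().lower()


def shard_key(city, num_shards=NUM_SHARDS):
    """Return the key of the index shard responsible for this city."""
    digest = hashlib.md5(normalize(city).encode("utf-8")).hexdigest()
    return "city_index_%d" % (int(digest, 16) % num_shards)


def build_sharded_index(addresses, num_shards=NUM_SHARDS):
    """Build {shard_key: {city: [address_key, ...]}} from {address_key: address}."""
    shards = defaultdict(lambda: defaultdict(list))
    for address_key, address in addresses.items():
        city = normalize(address["city"])
        shards[shard_key(city, num_shards)][city].append(address_key)
    # Convert to plain dicts so each shard can be serialized and stored as one object.
    return {key: dict(cities) for key, cities in shards.items()}


def lookup(shards, city, num_shards=NUM_SHARDS):
    """Find the address keys for a city by reading only the one shard that owns it."""
    shard = shards.get(shard_key(city, num_shards), {})
    return shard.get(normalize(city), [])
```

Each shard would then be stored under its own key, so a lookup for one
city touches a single small object, and a job over the whole index can
take all shard keys as inputs and process them in parallel.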

Ideally, I would then like to start a MapReduce job whose inputs are
the keys holding the index. The following map phases would work with
the keys/documents that came out of the first map phase.
Is it possible, within a map function, to look up an object by its key
and pass that object on to the next map phase?
That would make things a bit easier, because I could then handle
everything within Riak.
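
To make the question concrete, the job could look roughly like the
sketch below, which only builds the JSON job description in Python and
does not submit it. It rests on assumptions that would need to be
checked against the Riak MapReduce documentation: that a map phase
followed by another map phase may return [bucket, key] pairs which
become the inputs of the next phase, that each index object is JSON of
the form {"city": ["address_key", ...]}, and that the built-in
Riak.mapValuesJson function is available. The bucket names and the
build_job helper are hypothetical.

```python
import json

INDEX_BUCKET = "indexes"      # hypothetical bucket holding the index objects
ADDRESS_BUCKET = "addresses"  # hypothetical bucket holding the address objects

# First map phase (JavaScript, run inside Riak): read one index object and
# emit [bucket, key] pairs for the addresses of the requested city.
# Assumption: these pairs are used as inputs of the following map phase.
MAP_INDEX_TO_KEYS = """
function(value, keyData, arg) {
  var index = JSON.parse(value.values[0].data);
  var keys = index[arg.city] || [];
  return keys.map(function(k) { return [arg.bucket, k]; });
}
"""


def build_job(index_keys, city):
    """Build a two-phase MapReduce job description as a plain dict."""
    return {
        # Inputs are the keys of the index objects, not of the addresses.
        "inputs": [[INDEX_BUCKET, key] for key in index_keys],
        "query": [
            {"map": {
                "language": "javascript",
                "source": MAP_INDEX_TO_KEYS,
                "arg": {"city": city, "bucket": ADDRESS_BUCKET},
                "keep": False,  # intermediate result, not returned to the client
            }},
            {"map": {
                "language": "javascript",
                "name": "Riak.mapValuesJson",  # assumed built-in: parse each object as JSON
                "keep": True,
            }},
        ],
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the MapReduce endpoint.
    print(json.dumps(build_job(["city_index_0", "city_index_1"], "berlin"), indent=2))
```

If chaining map phases this way is not supported, the alternative would
be to run the first phase alone, collect the address keys on the client,
and submit them as the inputs of a second job.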

Thanks,
Felix

