Re: Multiple Index Queries using Riak and Python

Sreejith K Mon, 27 Feb 2012 22:35:42 -0800

Hi Mark,

I find this solution extremely useful in our PaaS, where we needed
to support APIs similar to Google App Engine's. Performance depends
largely on the number of keys fed into the MapReduce phase. It is
quite fast when you want to fetch a small number of records (~1000) from
a large data set (millions of records) using filters. But when the
MapReduce phase needs to fetch a large number of items, it is a little
slow, as is to be expected ;-). I'll surely share our experience as we go
further.
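
A minimal sketch of the two-step idea from the original post (quoted
below), using an in-memory dictionary in place of Riak. The names RECORDS,
SENTINELS, to_range, index_lookup, map_phase and multi_index_query are
hypothetical and not part of the wrapper. Intersecting the candidate key
sets before the map step is an assumption of this sketch, as is treating
index ranges as inclusive, which is why strict inequalities are re-checked
in the map step. The second return value counts how many records the map
step reads, which is the number the performance depends on.

```python
import operator

# Hypothetical in-memory stand-in for a bucket: key -> indexed fields.
RECORDS = {
    'sree':   {'name': 'Sreejith', 'age': 25},
    'vishnu': {'name': 'Vishnu', 'age': 31},
    'mark':   {'name': 'Mark', 'age': 52},
}

OPS = {'==': operator.eq, '<': operator.lt, '<=': operator.le,
       '>': operator.gt, '>=': operator.ge}

# Assumed (min, max) sentinels used to close open-ended ranges.
SENTINELS = {int: (0, 99999999999999999), str: ('', '\U0010ffff')}


def to_range(op, value):
    """Turn one comparison into an inclusive (start, end) index range."""
    low, high = SENTINELS[type(value)]
    if op == '==':
        return value, value
    if op in ('<', '<='):
        return low, value
    return value, high  # '>' or '>='


def index_lookup(field, start, end):
    """Step 1: keys whose indexed field lies in the inclusive range."""
    return {key for key, rec in RECORDS.items()
            if start <= rec[field] <= end}


def map_phase(keys, filters):
    """Step 2: re-apply every condition to each candidate record."""
    matched, fetched = [], 0
    for key in sorted(keys):
        rec = RECORDS[key]
        fetched += 1  # each input key costs one record read
        if all(OPS[op](rec[field], value) for field, op, value in filters):
            matched.append(key)
    return matched, fetched


def multi_index_query(filters):
    """Run one range lookup per filter, then filter the shared keys."""
    candidate_sets = [index_lookup(field, *to_range(op, value))
                      for field, op, value in filters]
    candidates = set.intersection(*candidate_sets)
    return map_phase(candidates, filters)
```

Calling multi_index_query([('age', '<', 50), ('name', '==', 'Vishnu')])
feeds a single key to the map step, while a loose filter such as
[('age', '<', 31)] feeds every key in the inclusive range and relies on
the map step to drop the boundary value.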


--
Regards,

Sreejith K


On Tue, Feb 28, 2012 at 2:17 AM, Mark Phillips <m...@basho.com> wrote:

> You should have sent around the blog post, too. :)
>
> http://foobarnbaz.com/2012/02/25/multi-index-queries-in-riak/
>
> Out of curiosity, are you using this in prod? Or is it just something
> you whipped up to see if it was feasible?
>
> Very cool and useful regardless. Thanks for sharing.
>
> Mark
>
> On Fri, Feb 24, 2012 at 2:05 AM, Sreejith K <sreejith...@gmail.com> wrote:
> > Hi everyone,
> >
> > I wrote a simple Python wrapper that uses Riak Secondary Indexes
> > and MapReduce for querying multiple indexes at once. I know it might not
> > be the ideal solution, but I thought it would be a cool thing to do. Here
> > is how it works.
> >
> > 1. Query multiple indexes and get the associated keys.
> > 2. Pass the keys to a MapReduce job where the filters are evaluated
> > again. The map phase applies all the conditions to each individual key.
> >
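> >     import riak
> >     # Assumes riak_multi_query.py (linked below) is on the import path.
> >     from riak_multi_query import RiakMultiIndexQuery
> >
> >     client = riak.RiakClient('localhost', 8091)
> >     bucket = client.bucket('test_multi_index')
> >
> >     # Store two objects, each tagged with a binary and an integer index.
> >     bucket.new('sree', {'name': 'Sreejith', 'age': '25'}).\
> >         add_index('name_bin', 'Sreejith').\
> >         add_index('age_int', 25).store()
> >     bucket.new('vishnu', {'name': 'Vishnu', 'age': '31'}).\
> >         add_index('name_bin', 'Vishnu').\
> >         add_index('age_int', 31).store()
> >
> >     # Chain one filter per index, then run the combined query.
> >     query = RiakMultiIndexQuery(client, 'test_multi_index')
> >     results = query.filter('age', '<', 50).\
> >         filter('name', '==', 'Vishnu').run()
> >     for res in results:
> >         print(res)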
> >
> > But the inequality filters are based on MIN and MAX values for both bin
> > and int indexes (I set those to 99999999999999999 for int and '' for
> > bin). I'm not sure whether that is correct or whether there is another
> > way to achieve that.
> >
> > Suggestions and patches are always welcome :-)
> >
> > https://github.com/semk/utils/blob/master/riak_multi_query.py
> >
> > --
> > Regards,
> >
> > Sreejith K
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
>

