On 26/09/13 14:33, jeffrey k eliasen wrote:
I'm trying to do some image processing using OpenCV. Later I'll be doing
some video processing as well. In a future project I will be using R to
do deep analysis on some data I'm collecting. In all these cases, what I
want to do is very simple with external languages but very hard with
both Erlang and Javascript.

What I want to do is simply invoke an external script on each element in
a bucket in the general case so that I can use advanced external tools
in an arbitrary manner. I was told by someone at Basho a long time ago
(about a year, which is a long time in internet years) that this could
be done by invoking scripts from Erlang, but I haven't heard back from
him since then and was hoping someone on the list could point me at an
example demonstrating this.

You'll make your life a lot easier if you invert your system.
Run your Python scripts somewhere outside Riak and have them query Riak for keys and data. Use a map-reduce job to partition the keys across the scripts. (E.g. if you are running six Python scripts in parallel, each one should receive only a sixth of all keys, so each script runs an MR job for the keys where key id modulo the number of scripts == script id.)
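
As a rough illustration of the modulo partitioning, here is a minimal Python sketch. It is not Riak-specific: it assumes each worker already has some way to obtain the list of keys, and it filters them on the client side rather than inside a map-reduce job. The names owns_key and keys_for_worker and the "img-NNNN" key format are made up for the example. Since keys are usually strings rather than integers, the sketch hashes them with zlib.crc32 first; Python's built-in hash() on strings varies between processes, so it would not give a consistent split across workers.

```python
import zlib


def owns_key(key, worker_id, num_workers):
    """Return True if the worker with this id is responsible for the key."""
    # crc32 gives the same value in every process, so all workers agree
    # on which partition a key belongs to.
    return zlib.crc32(key.encode("utf-8")) % num_workers == worker_id


def keys_for_worker(all_keys, worker_id, num_workers):
    """Yield only the keys that fall into this worker's partition."""
    for key in all_keys:
        if owns_key(key, worker_id, num_workers):
            yield key


if __name__ == "__main__":
    # Hypothetical keys; in practice these would come from the bucket.
    keys = ["img-%04d" % i for i in range(20)]
    num_workers = 6
    for worker_id in range(num_workers):
        print(worker_id, list(keys_for_worker(keys, worker_id, num_workers)))
```

Each script would be started with its own worker_id (0 to 5 for six scripts) and would only fetch and process the keys it owns, so no key is handled twice and none is skipped.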

If you're looking for something to distribute your python scripts across a number of compute nodes, then there are various existing systems for it - Condor, Helios, Gearman, etc.


My own experience of trying to use Riak as a holistic map/reduce system over bulk data did not end well. It's really not designed for that.

-Toby

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com