Hi Mathias,

Thank you for your thoughtful replies.

While Erlang code looks a bit odd to me, it probably would not be too difficult 
to write a few jobs in Erlang.

Sometimes, I do not see any timeouts.  Other times, I see one or two timeouts 
out of 100 MapReduce jobs.  The biggest problem is that sometimes the beam 
process will terminate due to timeout errors.  This type of error is 
unrecoverable.  If using JavaScript MapReduce makes Riak prone to crashing, 
this is a good reason to abandon JavaScript MapReduce for data processing.

I suspect that the MapReduce timeouts are related to the "extra 
overhead" of using the SpiderMonkey JavaScript engine, but I would need to 
do more testing to reach a firmer conclusion.

David


-----Original Message-----
From: Mathias Meyer [mailto:math...@basho.com] 
Sent: Wednesday, June 29, 2011 4:25 AM
To: David Mitchell
Cc: riak-users@lists.basho.com
Subject: Re: Mysterious JavaScript MapReduce crashes and timeouts with 0.14.2


On Saturday, June 25, 2011 at 00:23, David Mitchell wrote:

> Hi Mathias,
> 
> Thank you for responding and giving me things to try.
> 
> I am testing Riak using a hypervisor, so each node only has one CPU and 1.5 
> GB of RAM. I have 222,526 total keys stored.
>

 
I'd suggest testing your setup on something other than hypervisor-hosted 
virtual machines to get a good picture of performance, especially at this 
memory and machine size.


> Correct me if I am mistaken, but if we want to use Riak for data processing 
> via MapReduce, we should abandon JavaScript and opt for Erlang. This is my 
> first takeaway. Second, use Riak Search rather than key filters to do range 
> queries.
>

 
Considering your key filter setup, yes, you should be using Riak Search instead 
to prefilter your MapReduce inputs. As for MapReduce itself, it's well worth 
looking into Erlang, because it avoids a lot of the overhead involved with 
JavaScript MapReduce. I wouldn't say it's generally true that you should always 
go for Erlang for more intensive MapReducing; it just depends on how often you 
run your queries and what execution speed expectations you have of them. In 
your case I'd say it'd be well worth looking into Erlang as an alternative.
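
As a rough illustration only, here is a minimal sketch of what such a job 
could look like: a MapReduce request whose inputs come from Riak Search and 
whose map phase uses the built-in Erlang function 
riak_kv_mapreduce:map_object_value instead of JavaScript. The bucket name 
"events", the search query, the timeout value and the helper name 
build_mapred_job are all assumptions made up for this example; they are not 
taken from David's setup. The sketch only builds the JSON body and does not 
talk to a cluster.

```python
import json


def build_mapred_job(bucket, query, timeout_ms=60000):
    """Build a Riak MapReduce job description as a plain dict.

    Hypothetical helper for illustration; bucket and query are placeholders.
    """
    return {
        # Let Riak Search select the inputs instead of key filters.
        "inputs": {
            "module": "riak_search",
            "function": "mapred_search",
            "arg": [bucket, query],
        },
        "query": [
            # Built-in Erlang map phase: returns each object's value
            # without going through the JavaScript VM.
            {
                "map": {
                    "language": "erlang",
                    "module": "riak_kv_mapreduce",
                    "function": "map_object_value",
                    "keep": True,
                }
            },
        ],
        # Job timeout in milliseconds (value chosen arbitrarily here).
        "timeout": timeout_ms,
    }


# Placeholder bucket and range query, assumed for the example.
job = build_mapred_job("events", "timestamp:[20110601 TO 20110630]")
payload = json.dumps(job)
```

The resulting payload would be sent as the body of a POST to a node's 
/mapred endpoint with Content-Type application/json. Custom Erlang map or 
reduce functions would be referenced the same way, by module and function 
name, once the compiled module is available on every node.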

Mathias Meyer
Developer Advocate, Basho Technologies




_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
