Hi Mathias,

Thank you for responding and giving me things to try.

I am testing Riak on a hypervisor, so each node has only one CPU and 1.5 GB 
of RAM.  I have 222,526 total keys stored.

I used nginx to load balance the connections equally across the three nodes.  I 
paused two seconds between each query.  Then, I repeated the 208 MapReduce 
queries.  There was one timeout from riak02.  Riak02 reported that riak01 timed 
out during the map phase.  See the following URL for the sasl-error.log from 
riak02: https://gist.github.com/1045766

Riak02 reported that riak01 had a timeout at 4:16:12 PM.  The iostat output on 
riak01 around the time of the timeout was as follows:
Time: 04:16:07 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          18.36    0.00    1.62    0.00    0.00   80.02

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
hda               0.00     8.72    0.00    0.91     0.00    77.08    84.44     0.00    0.89   0.89   0.08

The highest CPU usage on each node was as follows:
Riak01: 31 percent
Riak02: 30 percent
Riak03: 30 percent

None of the nodes seemed to be under any stress when I got a MapReduce timeout.

Correct me if I am mistaken, but if we want to use Riak for data processing via 
MapReduce, we should abandon JavaScript and opt for Erlang.  That is my first 
takeaway.  Second, we should use Riak Search rather than key filters for range 
queries.

Again, thanks for your help.

David


-----Original Message-----
From: Mathias Meyer [mailto:math...@basho.com] 
Sent: Friday, June 24, 2011 3:37 PM
To: David Mitchell
Cc: riak-users@lists.basho.com
Subject: Re: Mysterious JavaScript MapReduce crashes and timeouts with 0.14.2

David,

Given your description of the error behaviour and the MapReduce job shown 
below, here's an outline of what's happening in your cluster when you fire that 
job 208 times in quick succession:

* All nodes in the cluster go through their keys using the filter you supplied; 
that's 208 passes over all 75k keys.
* Listing keys (which key filters still technically do) is a blocking 
operation, blocking the vnode responsible for a particular partition and 
therefore a particular set of keys. It's an expensive operation, and again, 
you're running hundreds of jobs in quick succession, forcing 208 key listings 
in a very short period of time.
* Then 208 MapReduce jobs have to be spread across the cluster to the 
respective nodes holding the data; each job is queued to be run by one of the 
JavaScript VM processes.
* Then all nodes with keys matching your criteria load all the objects matching 
the key filter from their respective disks, again 208 times multiplied by the 
number of objects that match the criteria.
* The reduce phase is run on only one node, in your case riak01; that's 208 
times again. This part is only CPU-bound, but it requires all the data from the 
jobs that just ran the map functions to be gathered on that particular node, 
again, 208 times.

The intention here is certainly not to excuse Riak for the behaviour it's 
showing, but to give you an idea that you're putting a considerable load on 
your cluster. The problem is that you're also not running an SMP-enabled 
version of Erlang, so everything runs on just one CPU/core, slowing things 
down even more. The process you're seeing should be beam.smp and not just 
beam. Make sure your Erlang is compiled with --enable-smp if you're on a 
multi-core/multi-CPU system.
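
If you want to double-check from a running node, one way is to attach a 
console (./riak attach) and ask the VM directly, for example:

%% should return true on an SMP-enabled build, false otherwise
erlang:system_info(smp_support).
%% number of scheduler threads the VM runs (with SMP, usually one per core)
erlang:system_info(schedulers).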

Other things you should/could try to reduce the risk of crashing Riak:

* Instead of rapid-fire submission, limit the number of concurrent MapReduce jobs.
* Ditch JavaScript and use Erlang instead. From looking at your code, the logic 
seems to be straightforward and should not be that hard to implement in Erlang 
(there's a rough sketch after this list). This is not a must, but you'll gain 
tremendously by not relying on the JavaScript VM, which involves lots of JSON 
serialization and call-outs to external libraries.
* Reduce the need for key filters, or simply avoid them entirely. If you need 
range queries or queries looking for values starting with a particular string, 
which seems to be the case here, consider using Riak Search instead.
* Spread the requests more evenly across your cluster, reducing the 
coordination load on just one of the nodes. This won't take away all of the 
pain, though, as it only reduces the load of the last step in your MapReduce 
job, so it's more of a general practice.
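
To illustrate the Erlang route, here's a rough, untested sketch of map and 
reduce functions roughly equivalent to your JavaScript ones. It assumes your 
values are JSON with LoadRange stored as a number, that mochijson2 (bundled 
with Riak) does the decoding, and the module and function names are just 
placeholders. You'd compile the module, put it on the code path of every node, 
and point the job spec at it with something like 
{"map":{"language":"erlang","module":"load_count","function":"map_load_range"}}:

-module(load_count).
-export([map_load_range/3, reduce_count/2]).

%% Map phase: emit [{<<"data.LoadN">>, 1}] when LoadRange is 1..10,
%% nothing otherwise. Mirrors the JavaScript map function quoted below.
map_load_range(Object, _KeyData, _Arg) ->
    {struct, Fields} = mochijson2:decode(riak_object:get_value(Object)),
    case proplists:get_value(<<"LoadRange">>, Fields) of
        N when is_integer(N), N >= 1, N =< 10 ->
            Key = list_to_binary("data.Load" ++ integer_to_list(N)),
            [[{Key, 1}]];
        _ ->
            []
    end.

%% Reduce phase: sum the per-key counts emitted by the map phase. Returns a
%% single proplist, so it can safely be re-run over its own output.
reduce_count(Values, _Arg) ->
    Totals = lists:foldl(fun(PropList, Acc) ->
                                 lists:foldl(fun({K, V}, A) ->
                                                     dict:update_counter(K, V, A)
                                             end, Acc, PropList)
                         end, dict:new(), Values),
    [dict:to_list(Totals)].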

Mathias Meyer
Developer Advocate, Basho Technologies


On Friday, June 24, 2011 at 20:43, David Mitchell wrote:

> I am doing 208 MapReduce jobs in rapid-fire succession using anonymous 
> JavaScript functions. I am sending the MapReduce jobs to a single node, 
> riak01. There are about 75,000 keys in the bucket.
> Erlang: R13B04
> Riak: 0.14.2
> 
> When I had my MapReduce timeout set to 120,000 ("timeout":120000), I was 
> getting mapexec_error, {error,timeout}. The first timeout was written to the 
> error log after seven seconds, the second and third after five seconds, and 
> the fourth after eight seconds. The beam process never crashed.
> 
> So, I increased the value to 30,000,000 ("timeout":30000000). In the first 
> run, all MapReduce jobs completed without error, each one taking about 1 to 3 
> seconds to complete.
> The CPU usage on riak01 was about 50 percent for all 208 jobs.
> Below is a sample output from iostat -x
> avg-cpu: %user %nice %system %iowait %steal %idle
>  51.00 0.00 5.01 0.10 0.00 43.89
> 
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
> hda 0.00 8.22 0.00 3.21 0.00 91.38 28.50 0.01 2.62 2.12 0.68
> 
> In the second run, the 53rd MapReduce job was still waiting to complete after 
> 10 minutes. So, there was never a timeout, and nothing was written to the 
> error logs. However, the beam process obviously crashed. On riak01, I 
> executed the following commands:
> ./riak-admin status
> Node is not running!
> ./riak ping
> Node 'riak@10.0.60.208' not responding to pings.
> ./riak attach
> Node is not running!
> 
> However, ps and top showed the process running.
> ps output:
> 1003 31807 1.0 8.7 172080 132584 pts/1 Rsl+ Jun22 28:53 
> /home/DMitchell/riak2/riak/rel/riak/erts-5.7.5/bin/beam -K true -A 64 
> -- -root /home/DMitchell/riak2/riak/rel/riak -progname riak -- -home 
> /home/DMitchell -- -boot 
> /home/DMitchell/riak2/riak/rel/riak/releases/0.14.2/riak -embedded 
> -config /home/DMitchell/riak2/riak/rel/riak/etc/app.config -name 
> riak@10.0.60.208 -setcookie riak -- console
> 
> top output:
>  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 31807 DMitchel 25 0 168m 129m 4360 R 99.3 8.7 30:46.08 beam
> 
> Below is a sample output from iostat -x when beam was in the crashed state:
> avg-cpu: %user %nice %system %iowait %steal %idle
>  100.00 0.00 0.00 0.00 0.00 0.00
> 
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
> hda 0.00 1.82 0.00 0.61 0.00 19.45 32.00 0.00 2.00 2.00 0.12
> 
> Note the 100 percent CPU usage for the beam process. I terminated the beam 
> process with: kill -s TERM 31807. Then, I restarted riak.
> 
> There were no errors on the other two nodes, except for:
> =ERROR REPORT==== 24-Jun-2011::12:29:36 ===
> ** Node 'riak@10.0.60.208' not responding **
> ** Removing (timedout) connection **
> 
> The MapReduce job is not that complex. I am using a key filter. The map phase 
> looks for a "LoadRange" field and creates a new variable (e.g., "Load1") if 
> there is a match. The reduce phase counts the matches.
> {
>   "inputs" : {
>     "bucket" : "names-51013",
>     "key_filters" : [["starts_with", "22204-1-3"]]
>   },
>   "query" : [
>     {
>       "map" : {
>         "keep" : false,
>         "language" : "javascript",
>         "arg" : null,
>         "source" : "function(value,keyData,arg){var data=Riak.mapValuesJson(value)[0];if(data.LoadRange&&data.LoadRange==1) return[{\"data.Load1\":1}];else if(data.LoadRange&&data.LoadRange==2) return[{\"data.Load2\":1}];else if(data.LoadRange&&data.LoadRange==3) return[{\"data.Load3\":1}];else if(data.LoadRange&&data.LoadRange==4) return[{\"data.Load4\":1}];else if(data.LoadRange&&data.LoadRange==5) return[{\"data.Load5\":1}];else if(data.LoadRange&&data.LoadRange==6) return[{\"data.Load6\":1}];else if(data.LoadRange&&data.LoadRange==7) return[{\"data.Load7\":1}];else if(data.LoadRange&&data.LoadRange==8) return[{\"data.Load8\":1}];else if(data.LoadRange&&data.LoadRange==9) return[{\"data.Load9\":1}];else if(data.LoadRange&&data.LoadRange==10) return[{\"data.Load10\":1}];else return[];}"
>       }
>     },
>     {
>       "reduce" : {
>         "keep" : true,
>         "language" : "javascript",
>         "arg" : null,
>         "source" : "function(v){var s={};for(var i in v){for(var n in v[i]){if(n in s) s[n]+=v[i][n];else s[n]=v[i][n];}} return[s];}"
>       }
>     }
>   ],
>   "timeout" : 30000000
> }
> 
> 
> The MapReduce timeouts seem to be happening at different places, e.g., during 
> the map phase, during the reduce phase, and during the key filtering phase 
> (#Fun<riak_kv_mapred_json.jsonify_not_found.1>,[],[]}).
> 
> See the URL below for the complete sasl-error.log right before a recent beam 
> crash.
> https://gist.github.com/1045386
> 
> Can anyone shed any light on why I am getting timeouts and crashes?
> 
> David
> 
> 



_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
