Hi Mathias,

Thank you for responding and giving me things to try.
I am testing Riak using a hypervisor, so each node has only one CPU and 1.5 GB of RAM. I have 222,526 total keys stored. I used nginx to load balance the connections equally across the three nodes. I paused two seconds between each query and then repeated the 208 MapReduce queries. There was one timeout from riak02: riak02 reported that riak01 timed out during the map phase. See the following URL for the sasl-error.log from riak02: https://gist.github.com/1045766

Riak02 reported that riak01 had a timeout at 4:16:12 PM. The iostat output on riak01 around the time of the crash was as follows:

Time: 04:16:07 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          18.36    0.00    1.62    0.00    0.00   80.02

Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
hda        0.00    8.72   0.00   0.91    0.00   77.08    84.44     0.00   0.89   0.89   0.08

The highest CPU usage on each node was:

Riak01: 31 percent
Riak02: 30 percent
Riak03: 30 percent

None of the nodes seemed to be under any stress when I got a MapReduce timeout.

Correct me if I am mistaken, but if we want to use Riak for data processing via MapReduce, we should abandon JavaScript and opt for Erlang. This is my first takeaway. Second, use Riak Search rather than key filters to do range queries.

Again, thanks for your help.

David

-----Original Message-----
From: Mathias Meyer [mailto:math...@basho.com]
Sent: Friday, June 24, 2011 3:37 PM
To: David Mitchell
Cc: riak-users@lists.basho.com
Subject: Re: Mysterious JavaScript MapReduce crashes and timeouts with 0.14.2

David,

given your description of the error behaviour and the MapReduce job shown below, here's an outline of what's happening in your cluster when you fire that job 208 times in quick succession:

* All nodes in the cluster go through their keys using the filter you supplied; that's 208 times of going through all 75k keys.
* Listing keys (which key filters still technically do) is a blocking operation, blocking the vnode responsible for a particular partition and therefore for a set of keys. It's an expensive operation, and again, you're running hundreds of jobs in quick succession, forcing 208 key listings in a very short period of time.
* Then 208 MapReduce jobs have to be spread across the cluster to the respective nodes holding the data; each job is queued to be run by one of the JavaScript VM processes.
* Then all nodes with keys matching your criteria load all the objects matching the key filter from their respective disks, again 208 times, multiplied by the number of objects that match the criteria.
* The reduce phase is run on only one node, in your case riak01; that's 208 times again. This is only CPU bound, obviously, but it requires all the data to be gathered on that particular node from all the jobs that just ran the map functions, again 208 times.

The intention here is certainly not to excuse Riak for the behaviour it's showing. It's to give you an idea that you're putting a considerable load on your cluster. The problem is that you're also not running an SMP-enabled version of Erlang, so everything utilizes just one CPU/core, slowing things down even more. The process you're seeing should be beam.smp and not just beam. Make sure your Erlang is compiled with --enable-smp if you're on a multi-core/multi-CPU system.

Other things you should/could try to reduce the risk of crashing Riak:

* Instead of rapid-fire, limit the number of concurrent MapReduce jobs (see the note on the JavaScript VM pool below this list).
* Ditch JavaScript, use Erlang instead. From looking at your code, the logic seems to be straightforward and should not be that hard to implement in Erlang (a rough sketch follows this list). This is not a must, but you'll gain tremendously by not relying on the JavaScript VM, which involves lots of JSON serialization and call-outs to external libraries.
* Reduce the need for key filters, or simply avoid them entirely. If you need range queries or queries looking for values starting with a particular string, which seems to be the case here, consider using Riak Search instead.
* Spread the requests more evenly across your cluster, reducing the coordination load on just one of the nodes. This won't help with the entire pain, though, as it only reduces the load of the last step in your MapReduce job, so it's more of a general practice.
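To illustrate the suggestion about moving the job to Erlang, here is a minimal sketch of the same map and reduce logic. It is not a drop-in implementation: the module and function names are made up for the example, and it assumes the stored values are JSON objects with an integer LoadRange field between 1 and 10, decoded with mochijson2 (which ships with Riak).

    %% Hypothetical module; compile it and put it on the code path of every node.
    -module(load_range_counter).
    -export([map_load_range/3, reduce_count/2]).

    %% Map phase: emit [{"data.LoadN": 1}] when LoadRange is N (1..10), else nothing.
    map_load_range(RiakObject, _KeyData, _Arg) ->
        {struct, Fields} = mochijson2:decode(riak_object:get_value(RiakObject)),
        case proplists:get_value(<<"LoadRange">>, Fields) of
            N when is_integer(N), N >= 1, N =< 10 ->
                Key = list_to_binary("data.Load" ++ integer_to_list(N)),
                [{struct, [{Key, 1}]}];
            _ ->
                []
        end.

    %% Reduce phase: sum the per-key counts across all map (and re-reduce) results.
    reduce_count(Values, _Arg) ->
        Totals = lists:foldl(
                   fun({struct, Pairs}, Acc) ->
                           lists:foldl(fun({K, N}, A) ->
                                               dict:update_counter(K, N, A)
                                       end, Acc, Pairs)
                   end,
                   dict:new(), Values),
        [{struct, dict:to_list(Totals)}].

With a module like this on the nodes, the map and reduce phases in the job can reference "language":"erlang" together with "module" and "function" instead of inline JavaScript source, so no JavaScript VM or JSON round-trip through Spidermonkey is involved.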
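Related to the point about jobs queueing on the JavaScript VM processes: the size of that pool is configurable in the riak_kv section of etc/app.config. The sketch below only shows the idea; the values are illustrative, and the exact setting names and defaults should be checked against your release. Raising them trades memory for headroom, but the real fix is still to throttle the client.

    %% Excerpt from etc/app.config (riak_kv section) -- illustrative values only.
    {riak_kv, [
        %% Number of JavaScript VMs reserved for map phases on each node.
        {map_js_vm_count, 8},
        %% Number of JavaScript VMs reserved for reduce phases on each node.
        {reduce_js_vm_count, 6},
        %% Memory ceiling (in MB) for each JavaScript VM instance.
        {js_max_vm_mem, 8}
    ]}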
Mathias Meyer
Developer Advocate, Basho Technologies

On Friday, June 24, 2011 at 20:43, David Mitchell wrote:

> I am doing 208 MapReduce jobs in rapid-fire succession using anonymous
> JavaScript functions. I am sending the MapReduce jobs to a single node,
> riak01. There are about 75,000 keys in the bucket.
> Erlang: R13B04
> Riak: 0.14.2
>
> When I had my MapReduce timeout set to 120,000 ("timeout":120000), I
> was getting mapexec_error, {error,timeout}. The first timeout wrote to
> the error log after seven seconds. The second and third wrote to the error
> log after five seconds. The fourth timeout wrote to the error log after
> eight seconds. The beam process never crashed.
>
> So, I increased the value to 30,000,000 ("timeout":30000000). In the first
> run, all MapReduce jobs completed without error, each one taking about 1 to 3
> seconds to complete. The CPU usage on riak01 was about 50 percent for all
> 208 jobs. Below is a sample output from iostat -x:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           51.00    0.00    5.01    0.10    0.00   43.89
>
> Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
> hda        0.00    8.22   0.00   3.21    0.00   91.38    28.50     0.01   2.62   2.12   0.68
>
> In the second run, on the 53rd MapReduce job, the job was still waiting to
> complete after 10 minutes. So, there was never a timeout, and nothing was
> written to the error logs. However, the beam process obviously crashed. On
> riak01, I executed the following commands:
>
> ./riak-admin status
> Node is not running!
> ./riak ping
> Node 'riak@10.0.60.208' not responding to pings.
> ./riak attach
> Node is not running!
>
> However, ps and top showed the process running.
>
> ps output:
> 1003 31807 1.0 8.7 172080 132584 pts/1 Rsl+ Jun22 28:53
> /home/DMitchell/riak2/riak/rel/riak/erts-5.7.5/bin/beam -K true -A 64
> -- -root /home/DMitchell/riak2/riak/rel/riak -progname riak -- -home
> /home/DMitchell -- -boot
> /home/DMitchell/riak2/riak/rel/riak/releases/0.14.2/riak -embedded
> -config /home/DMitchell/riak2/riak/rel/riak/etc/app.config -name
> riak@10.0.60.208 -setcookie riak -- console
>
> top output:
> PID   USER     PR NI VIRT RES  SHR  S %CPU %MEM TIME+    COMMAND
> 31807 DMitchel 25  0 168m 129m 4360 R 99.3  8.7 30:46.08 beam
>
> Below is a sample output from iostat -x when beam was in the crashed state:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>          100.00    0.00    0.00    0.00    0.00    0.00
>
> Device:  rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
> hda        0.00    1.82   0.00   0.61    0.00   19.45    32.00     0.00   2.00   2.00   0.12
>
> Note the 100 percent CPU usage for the beam process. I terminated the beam
> process with: kill -s TERM 31807. Then, I restarted riak.
>
> There were no errors on the other two nodes, except for:
>
> =ERROR REPORT==== 24-Jun-2011::12:29:36 ===
> ** Node 'riak@10.0.60.208' not responding **
> ** Removing (timedout) connection **
>
> The MapReduce job is not that complex. I am using a key filter. The map phase
> looks for a "LoadRange" and creates a new variable (e.g., "Load1") if there
> is a match. The reduce phase counts the matches.
>
> {
>   "inputs" : {
>     "bucket" : "names-51013",
>     "key_filters" : [["starts_with", "22204-1-3"]]
>   },
>   "query" : [
>     {
>       "map" : {
>         "keep" : false,
>         "language" : "javascript",
>         "arg" : null,
>         "source" : "function(value,keyData,arg){var data=Riak.mapValuesJson(value)[0];if(data.LoadRange&&data.LoadRange==1) return[{\"data.Load1\":1}];else if(data.LoadRange&&data.LoadRange==2) return[{\"data.Load2\":1}];else if(data.LoadRange&&data.LoadRange==3) return[{\"data.Load3\":1}];else if(data.LoadRange&&data.LoadRange==4) return[{\"data.Load4\":1}];else if(data.LoadRange&&data.LoadRange==5) return[{\"data.Load5\":1}];else if(data.LoadRange&&data.LoadRange==6) return[{\"data.Load6\":1}];else if(data.LoadRange&&data.LoadRange==7) return[{\"data.Load7\":1}];else if(data.LoadRange&&data.LoadRange==8) return[{\"data.Load8\":1}];else if(data.LoadRange&&data.LoadRange==9) return[{\"data.Load9\":1}];else if(data.LoadRange&&data.LoadRange==10) return[{\"data.Load10\":1}];else return[];}"
>       }
>     },
>     {
>       "reduce" : {
>         "keep" : true,
>         "language" : "javascript",
>         "arg" : null,
>         "source" : "function(v){var s={};for(var i in v){for(var n in v[i]){if(n in s) s[n]+=v[i][n];else s[n]=v[i][n];}} return[s];}"
>       }
>     }
>   ],
>   "timeout" : 30000000
> }
>
> The MapReduce timeouts seem to be happening at different places, e.g., during
> the map phase, during the reduce phase, and during the key filtering phase
> ({#Fun<riak_kv_mapred_json.jsonify_not_found.1>,[],[]}).
>
> See the URL below for the complete sasl-error.log right before a recent beam
> crash.
> https://gist.github.com/1045386
>
> Can anyone shed any light on why I am getting timeouts and crashes?
>
> David

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com