On Thu, Jan 13, 2011 at 03:08, Alexander Sicular <sicul...@gmail.com> wrote:
> Item number one: if you are using stock riak then you are also using the
> stock nval (number of replicas) of 3. This means that your 1000 k/v write is
> actually 3000 items written to disk.

Reducing n_val does not improve m/r times noticeably for my test, and
reduces list performance only slightly.

> There are more caveats but I'll end with three. For any critically
> performant system you must use the protocol buffers interface

Understandable, but probably not relevant in this case since there is so
little data involved.

> and you must juggle connections.

Do you mean pooling? If not, what do you mean?

> Additionally, anonymous JavaScript functions have a
> penalty associated.

Does this entail pre-defining them somewhere? I can't find any
documentation on this; can you point me to the relevant place?

> Lastly you should also upgrade from JavaScript m/r
> functions to erlang. There is performance impedance when pushing json from
> the native erlang interface into the JavaScript vm.

Basho claims it's insignificant, though: "There is a slight overhead when
encoding the Riak object to JSON but otherwise the performance [of Erlang
named functions vs. JS named functions] is comparable." [1]

[1] http://blog.basho.com/2010/07/27/webinar-recap---mapreduce-querying-in-riak/

> Riak has many benefits but bleeding single node performance is not one of
> them. Predictable, scaleable units of performance per node throughout a
> cluster is.

Unfortunately, even if additional nodes yield linear performance gains, the
m/r overhead seems very large -- if I'm getting 1.5 seconds to process 1,000
items on one node, it seems apparent that I should get roughly 1.5 seconds
to process 3,000 items on 3 nodes, which still is awfully slow.

Do you know how Riak compares to HBase, MongoDB or Cassandra for large
dataset processing and analysis with m/r, when talking hundreds of millions,
or even billions of keys? It would seem that key traversal performance would
prevent Riak from competing in that space. Maybe you could do something
with Riak Search, but I'm not sure if it would be comparable.
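
To put rough numbers on the linear-scaling estimate above, here is a
back-of-the-envelope sketch. It assumes perfectly linear scaling across
nodes and that the 1.5 s per 1,000 items I measured holds at any dataset
size; both are assumptions, not measurements (fixed per-job overhead may
well make a 1,000-item test look worse per item than a large job would).
The function name and the node counts are made up for illustration.

```python
def estimated_seconds(items, nodes, secs_per_1000_items_per_node=1.5):
    """Estimate m/r wall time, ASSUMING perfectly linear scaling across nodes.

    The 1.5 s per 1,000 items default is the single-node figure from my test;
    extrapolating it to large datasets is an assumption, not a measurement.
    """
    items_per_node = items / nodes
    return items_per_node / 1000 * secs_per_1000_items_per_node


if __name__ == "__main__":
    # (items, nodes) pairs; the cluster sizes are hypothetical.
    for items, nodes in [(1_000, 1), (3_000, 3),
                         (100_000_000, 3), (1_000_000_000, 30)]:
        secs = estimated_seconds(items, nodes)
        print(f"{items:>13,} items on {nodes:>2} nodes: "
              f"{secs:>10,.1f} s ({secs / 3600:.2f} h)")
```

Under those assumptions, 100 million keys on 3 nodes comes out at about
50,000 seconds, i.e. roughly 14 hours for a single pass.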
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com