We still have this problem (we're on riak 1.4.9) and it's very frustrating!

Our average object size right now is ~250k, and we're running with: +zdbbl 2097151

I've tried the settings above on a 5-node test cluster with no improvement. I then bumped both buffers up to 1048576 on all nodes - no improvement. Finally I tried putting the buffers up to 4194304 - still no improvement.
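Concretely, that last attempt was Evan's snippet from below with both buffers raised; a rough sketch of what I ran (the buffer values are the only change):

    %% same approach as the snippet further down, with both
    %% dist buffers at 4 MB on every node
    FF = fun() ->
        [inet:setopts(Port, [{sndbuf, 4194304}, {recbuf, 4194304}])
         || {_Node, Port} <- erlang:system_info(dist_ctrl)]
    end.
    rpc:multicall(erlang, apply, [FF, []]).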
For the record my kernel is Ubuntu 3.13.0-27, with the following network settings:

    net.core.netdev_max_backlog = 10000
    net.core.rmem_default = 8388608
    net.core.rmem_max = 104857600
    net.core.somaxconn = 4000
    net.core.wmem_default = 8388608
    net.core.wmem_max = 104857600
    net.ipv4.tcp_congestion_control = cubic
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_low_latency = 0
    net.ipv4.tcp_max_syn_backlog = 40000
    net.ipv4.tcp_slow_start_after_idle = 0
    net.ipv4.tcp_tw_reuse = 1

Chris

On Wed, Jun 18, 2014 at 7:32 PM, Evan Vigil-McClanahan
<emcclana...@basho.com> wrote:
> Hi Earl,
>
> There are some known internode bottlenecks in riak 1.4.x. We've
> addressed some of them in 2.0, but others likely remain. If you're
> willing to run some code at the console (from `riak attach`), the
> following should tell you whether or not the 2.0 changes are likely
> to help you. I am not sure when 2.0-ready versions of CS are due,
> however.
>
> -----
> [inet:setopts(Port, [{sndbuf, 393216}, {recbuf, 786432}])
>  || {_Node, Port} <- erlang:system_info(dist_ctrl)].
>
> Or, to run this on all nodes (which you'll have to do to see if it
> helps):
>
> FF = fun() ->
>     [inet:setopts(Port, [{sndbuf, 393216}, {recbuf, 786432}])
>      || {_Node, Port} <- erlang:system_info(dist_ctrl)]
> end.
> rpc:multicall(erlang, apply, [FF, []]).
>
> You should not run any of this on production machines without
> extensive testing first. Also, if you have huge objects, as in a CS
> cluster, it may help to increase the buffer sizes somewhat.
>
> Note that increasing +zdbbl in your vm.args can also help somewhat,
> if it isn't already prohibitively large.
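>
> If you want to confirm that distribution is actually what's
> saturating before changing anything, a minimal sketch (the monitor
> target here is just the shell process; any process that can print
> the messages will do):
>
> erlang:system_monitor(self(), [busy_dist_port]).
> %% generate some load, then flush the shell mailbox; any
> %% {monitor, Pid, busy_dist_port, Port} messages mean senders are
> %% blocking on a distribution port, so +zdbbl / buffer tuning is
> %% worth pursuing.
> flush().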
> Hope that this helps. Let us know what you find.
>
> Evan
>
> On Wed, Jun 18, 2014 at 4:57 PM, Earl Ruby <earl_r...@xyratex.com> wrote:
>> Chris Read:
>>
>> Back in 2013 you reported a performance problem with Riak 1.4.2
>> running on a 10GbE network, where Riak would never hit speeds faster
>> than 2.5Gbps on the network.
>>
>> I'm seeing the same thing with Riak 1.4.2 and RiakCS. I've followed
>> all of the tuning suggestions: my MTU is set to 9000 on the ethernet
>> interfaces, and I have one 10GbE network just for the backend
>> inter-node data and one 10GbE "public" network where RiakCS listens
>> for connections and which basho_bench uses to generate the load. I
>> have 1-4 client systems on the public side running basho_bench, and
>> no matter how much traffic I generate I never see more than
>> 3Gbits/s on the network. (It doesn't seem to matter whether I run 1
>> or 4 clients, each with 200 concurrent sessions; the network data
>> rate is about the same.) I'm running jnettop in two different
>> windows during the tests to watch the aggregate traffic on the
>> private inter-node data network and on the "public" basho_bench
>> traffic-generating network.
>>
>> I've tested the network with iperf3 and it shows 9.92Gbits/s
>> throughput with a TCP maximum segment size of 9000.
>>
>> I've tested the filesystems on each of the 6 Riak nodes using fio,
>> and I can write to the filesystems at ~12.8Gbits/s, so the
>> filesystem is not the bottleneck. Each node has 128GB RAM and is
>> running the bitcask backend. The servers are mostly idle.
>>
>> I tried Sean's solution of increasing these values to:
>>
>> {riak_core, [
>>     {handoff_batch_threshold, 4194304},
>>     {handoff_concurrency, 10}
>> ]}
>>
>> ... as described in
>> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-October/013787.html,
>> but that had no effect.
>>
>> With my current hardware I'd expect the 10GbE network to be the
>> bottleneck, and I'd expect write speeds to top out near the top end
>> of the network speed.
>>
>> There was no follow-up message on the mailing list to indicate how,
>> or whether, you'd solved the problem. Did you find a solution?
>>
>> (Please direct replies to the mailing list.)
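>>
>> P.S. For anyone comparing setups: a minimal check of what buffer
>> sizes the inter-node sockets are actually using, runnable from
>> `riak attach` (assuming the distribution carriers are plain TCP
>> ports):
>>
>> %% returns one {Node, {ok, [{sndbuf, _}, {recbuf, _}]}} per peer
>> [{Node, inet:getopts(Port, [sndbuf, recbuf])}
>>  || {Node, Port} <- erlang:system_info(dist_ctrl)].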