Good suggestion. Erlang only has 39 ports open. I initially tried increasing max_open_files to a huge number just to see what would happen, but there was no noticeable difference in performance. Also, lsof -u riak shows over 12000 fds open even after setting max_open_files back down to 20, and the number keeps growing until it hits the system limit, which then causes that "Accept failed error".

I'm OK with increasing the max_open_files limit, but it doesn't really seem to improve performance, at least in my case. And even when I try to limit the number of file descriptors with max_open_files, riak still opens new ones until it crashes.
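To see what those 12000 fds actually are (LevelDB data files versus sockets), the lsof output can be tallied by file type. Below is a rough sketch, not a definitive tool: it assumes lsof's machine-readable field output (lsof -u riak -F t), where each line starts with a one-character field id and lines starting with "t" carry the file type. The function name count_fd_types and the sample lines are made up for illustration.

```python
from collections import Counter


def count_fd_types(lsof_field_lines):
    """Tally open files by type from `lsof -F t` style field output.

    Each line begins with a one-character field id; only lines that
    begin with 't' (the file type, e.g. REG or IPv4) are counted.
    """
    counts = Counter()
    for line in lsof_field_lines:
        line = line.strip()
        if line.startswith("t") and len(line) > 1:
            counts[line[1:]] += 1
    return counts


# Made-up sample in the shape of field output, for illustration only.
sample = ["p1234", "f10", "tREG", "f11", "tREG", "f12", "tIPv4"]
for fd_type, n in count_fd_types(sample).most_common():
    print(fd_type, n)
```

With real data, the lines would come from the output of lsof -u riak -F t instead of the sample list. If nearly all of the entries are regular files (REG), the growth is coming from the storage backend rather than from client connections.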
On Sep 26, 2011, at 2:54 PM, Jon Meredith wrote:

> Hi Patrick,
>
> I suggested increasing ports as you had an emfile on a socket accept call.
> Erlang uses ports for things like network sockets and file handles when
> opened by *erlang* processes. However, the leveldb library manages it's own
> sockets as it is a C++ library dynamically loaded by the emulator and so
> doesn't count towards ports.
>
> Is it possible that Riak started getting more client load if request latency
> increased? Changing max_open_files will keep the number of process-level
> file handles lower, but will cause more opening and closing of files to
> search them. If you have a nice modern OS with lots of file handles
> available, you may be able to increase max_open_files for increased
> performance.
>
> If you want to check how many ports you are using you can run this from the
> riak console.
>
> (dev1@127.0.0.1)7> length(erlang:ports()).
> 39
>
> Try increasing your max_open_ports and checking how many file handles are in
> use using a tool like lsof and check the number of ports you have opened.
>
> Cheers, Jon.
>
> On Mon, Sep 26, 2011 at 12:44 PM, Patrick Van Stee <vans...@highgroove.com>
> wrote:
> Thanks for the quick response Jon. I bumped it from 4096 up to the max I have
> set in /etc/riak/defaults and writes actually slowed down a little bit (~10
> less writes per second). Shouldn't the max_open_files setting keep the total
> amount of fd's pretty low? Maybe I'm misunderstanding what that option is
> used for.
>
> Patrick
>
> On Sep 26, 2011, at 2:34 PM, Jon Meredith wrote:
>
>> Hi Patrick,
>>
>> You may be running out of ports which erlang uses for TCP sockets - try
>> increasing ERL_MAX_PORTS in etc/vm.args
>>
>> Cheers, Jon
>> Basho Technologies.
>>
>> On Mon, Sep 26, 2011 at 12:17 PM, Patrick Van Stee <vans...@highgroove.com>
>> wrote:
>> We're running a small, 2 node riak cluster (on 2 m1.large boxes) using the
>> LevelDB backend and have been trying to write ~250 keys a second at it. With
>> a small dataset everything was running smoothly. However, after storing
>> several hundred thousand keys some performance issues started to show up.
>>
>> * We're running out of file descriptors which is causing nodes to crash with
>> the following error:
>>
>> 2011-09-24 00:23:52.097 [error] <0.110.0> CRASH REPORT Process [] with 0
>> neighbours crashed with reason: {error,accept_failed}
>> 2011-09-24 00:23:52.098 [error] <0.121.0> application: mochiweb, "Accept
>> failed error", "{error,emfile}
>>
>> Setting the max_open_files limit in the app.config doesn't seem to help.
>>
>> * Writes have slowed down by an order of magnitude. I even set the n_val, w,
>> and dw bucket properties to 1 without any noticeable difference. Also we
>> switched to using protocol buffers to make sure there wasn't any extra
>> overhead when using HTTP.
>>
>> * Running map reduce jobs that use a range query on a secondary index
>> started returning an error, {"error":"map_reduce_error"}, once our dataset
>> increased in size. Feeding a list of keys works fine, but querying the index
>> for keys seems to be timing out:
>>
>> 2011-09-26 16:37:57.192 [error] <0.136.0> Supervisor riak_pipe_fitting_sup
>> had child undefined started with {riak_pipe_fitting,start_link,undefined} at
>> <0.3497.0> exit with reason
>> {timeout,{gen_server,call,[{riak_pipe_vnode_master,'riak@10.206.105.52'},{return_vnode,{'riak_vnode_req_v1',502391187832497878132516661246222288006726811648,{raw,#Ref<0.0.1.88700>,<0.3500.0>},{cmd_enqueue,{fitting,<0.3499.0>,#Ref<0.0.1.88700>,#Fun<riak_kv_mrc_pipe.0.133305895>,#Fun<riak_kv_mrc_pipe.1.125635227>},{<<"ip_queries">>,<<"uaukXZn5rZQ0LrSED3pi-fE-JjU">>},infinity,[{502391187832497878132516661246222288006726811648,'riak@10.206.105.52'}]}}}]}}
>> in context child_terminated
>>
>> Is anyone familiar with these problems or is there anything else I can try
>> to increase the performance when using LevelDB?
>> _______________________________________________
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com