On 02/08/13 13:13, Jeremy Ong wrote:
What erlang version did you build with? How are you load balancing
between the nodes? What kind of disks are you using?


I don't think load balancing or poor disks could cause performance to drop to that 1/second rate.

I mean, even if you were using a single consumer SATA disk and running a bunch of virtual machines on a laptop, with no load balancing at all, you'd still get much better performance than 1/s.

T

On Thu, Aug 1, 2013 at 7:53 PM, Paul Ingalls <p...@fanzo.me> wrote:
FYI, 2 more nodes died at the end of the last test.  Storm, which I'm
using to put data in, kills the topology a bit abruptly; perhaps the nodes
don't like a client going away like that?

log from one of the nodes:

2013-08-02 02:27:23 =ERROR REPORT====
Error in process <0.4959.0> on node 'riak@riak004' with exit value:
{badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...

2013-08-02 02:27:33 =ERROR REPORT====
Error in process <0.5055.0> on node 'riak@riak004' with exit value:
{badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...

2013-08-02 02:27:51 =ERROR REPORT====
Error in process <0.5228.0> on node 'riak@riak004' with exit value:
{badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...

and the log from the other node:

2013-08-02 00:09:39 =ERROR REPORT====
Error in process <0.4952.0> on node 'riak@riak007' with exit value:
{badarg,[{riak_core_stat,vnodeq_len,1,[{file,"src/riak_core_stat.erl"},{line,181}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[{file,"src/riak_core_stat.erl"},{line,172}]},{riak_core_stat,'-vnodeq_stats/0-lc$^0/1-0-',1,[...

2013-08-02 00:09:44 =ERROR REPORT====
** State machine <0.2368.0> terminating
** Last event in was unregistered
** When State == active
**      Data  ==
{state,114179815416476790484662877555959610910619729920,riak_kv_vnode,{deleted,{state,114179815416476790484662877555959610910619729920,riak_kv_eleveldb_backend,{state,<<>>,"/mnt/datadrive/riak/data/leveldb/114179815416476790484662877555959610910619729920",[{create_if_missing,true},{max_open_files,128},{use_bloomfilter,true},{write_buffer_size,58858594}],[{add_paths,[]},{allow_strfun,false},{anti_entropy,{on,[]}},{anti_entropy_build_limit,{1,3600000}},{anti_entropy_concurrency,2},{anti_entropy_data_dir,"/mnt/datadrive/riak/data/anti_entropy"},{anti_entropy_expire,604800000},{anti_entropy_leveldb_opts,[{write_buffer_size,4194304},{max_open_files,20}]},{anti_entropy_tick,15000},{create_if_missing,true},{data_root,"/mnt/datadrive/riak/data/leveldb"},{fsm_limit,50000},{hook_js_vm_count,2},{http_url_encoding,on},{included_applications,[]},{js_max_vm_mem,8},{js_thread_stack,16},{legacy_stats,true},{listkeys_backpressure,true},{map_js_vm_count,8},{mapred_2i_pipe,true},{mapred_name,"mapred"},{max_open_files,128},{object_format,v1},{reduce_js_vm_count,6},{stats_urlpath,"stats"},{storage_backend,riak_kv_eleveldb_backend},{use_bloomfilter,true},{vnode_vclocks,true},{write_buffer_size,58858594}],[],[],[{fill_cache,false}],true,false},{dict,0,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},undefined,3000,1000,100,100,true,true,undefined}},riak@riak003,none,undefined,undefined,undefined,{pool,riak_kv_worker,10,[]},undefined,107615}
** Reason for termination =
**
{badarg,[{eleveldb,close,[<<>>],[]},{riak_kv_eleveldb_backend,stop,1,[{file,"src/riak_kv_eleveldb_backend.erl"},{line,149}]},{riak_kv_vnode,terminate,2,[{file,"src/riak_kv_vnode.erl"},{line,836}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,847}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,586}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
2013-08-02 00:09:44 =CRASH REPORT====
   crasher:
     initial call: riak_core_vnode:init/1
     pid: <0.2368.0>
     registered_name: []
     exception exit:
{{badarg,[{eleveldb,close,[<<>>],[]},{riak_kv_eleveldb_backend,stop,1,[{file,"src/riak_kv_eleveldb_backend.erl"},{line,149}]},{riak_kv_vnode,terminate,2,[{file,"src/riak_kv_vnode.erl"},{line,836}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,847}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,586}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]},[{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,589}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
     ancestors: [riak_core_vnode_sup,riak_core_sup,<0.139.0>]
     messages: []
     links: [<0.142.0>]
     dictionary: [{random_seed,{8115,23258,22987}}]
     trap_exit: true
     status: running
     heap_size: 196418
     stack_size: 24
     reductions: 12124
   neighbours:
2013-08-02 00:09:44 =SUPERVISOR REPORT====
      Supervisor: {local,riak_core_vnode_sup}
      Context:    child_terminated
      Reason:
{badarg,[{eleveldb,close,[<<>>],[]},{riak_kv_eleveldb_backend,stop,1,[{file,"src/riak_kv_eleveldb_backend.erl"},{line,149}]},{riak_kv_vnode,terminate,2,[{file,"src/riak_kv_vnode.erl"},{line,836}]},{riak_core_vnode,terminate,3,[{file,"src/riak_core_vnode.erl"},{line,847}]},{gen_fsm,terminate,7,[{file,"gen_fsm.erl"},{line,586}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
      Offender:
[{pid,<0.2368.0>},{name,undefined},{mfargs,{riak_core_vnode,start_link,undefined}},{restart_type,temporary},{shutdown,300000},{child_type,worker}]



Paul Ingalls
Founder & CEO Fanzo
p...@fanzo.me
@paulingalls
http://www.linkedin.com/in/paulingalls



On Aug 1, 2013, at 7:49 PM, Paul Ingalls <p...@fanzo.me> wrote:

I should say that I built Riak from the master branch of the git repository.
Perhaps that was a bad idea?

Paul Ingalls
Founder & CEO Fanzo
p...@fanzo.me
@paulingalls
http://www.linkedin.com/in/paulingalls



On Aug 1, 2013, at 7:47 PM, Paul Ingalls <p...@fanzo.me> wrote:

Thanks for the quick response Matthew!

I gave that a shot, and if anything the performance was worse.  When I
picked 128, I ran through the calculations on this page:

http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/#Parameter-Planning

and thought that would work, but it sounds like I was quite a bit off from
what you have below.
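
For context, the knob in question lives in the eleveldb section of app.config. This is roughly what mine looks like; it's an illustrative fragment reflecting the settings mentioned in this thread (max_open_files of 128, the data drive path, my write_buffer_size), not a paste of the actual file:

%% Illustrative eleveldb section of app.config; values match the settings
%% discussed in this thread, layout is a sketch rather than the real file.
{eleveldb, [
    {data_root, "/mnt/datadrive/riak/data/leveldb"},
    {max_open_files, 128},              %% the setting under discussion
    {write_buffer_size, 58858594},
    {use_bloomfilter, true},
    {create_if_missing, true}
]}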

Looking at Riak Control, the memory was staying pretty low, and watching top,
the CPU was well in hand.  iostat showed very little of the CPU in iowait,
although it was writing a lot.  I imagine, however, that this is missing a
lot of the details.

Any other ideas?  I can't imagine one get/update/put cycle per second is the
best I can do…

Thanks!

Paul Ingalls
Founder & CEO Fanzo
p...@fanzo.me
@paulingalls
http://www.linkedin.com/in/paulingalls



On Aug 1, 2013, at 7:12 PM, Matthew Von-Maszewski <matth...@basho.com>
wrote:

Try cutting your max open files in half.  I am working from my iPad, not my
workstation, so my numbers are rough.  I will get better ones to you in the
morning.

The math goes like this:

- vnode/partition heap usage is (4 MB * (max_open_files - 10)) + 8 MB
- you have 18 vnodes per server (multiply the above by 18)
- AAE (active anti-entropy) is "on", so that adds (4 MB * 10 + 8 MB) times
18 vnodes

The three lines above give the total memory leveldb will attempt to use per
server if your dataset is large enough to fill it.
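
If it helps, here is that arithmetic as a throwaway Erlang snippet. The 18-vnode count and the AAE allowance are the rough assumptions above, not output from any official sizing tool:

%% Rough per-server leveldb memory estimate following the math above.
%% Assumes 18 vnodes per server and AAE enabled; purely illustrative.
-module(leveldb_mem).
-export([estimate/1]).

estimate(MaxOpenFiles) ->
    MB = 1024 * 1024,
    Vnodes = 18,
    %% per-vnode leveldb heap: 4 MB per open file beyond 10, plus 8 MB fixed
    PerVnode = (4 * MB * (MaxOpenFiles - 10)) + 8 * MB,
    %% AAE runs its own leveldb per vnode, modeled here as 10 files + 8 MB
    AaePerVnode = (4 * MB * 10) + 8 * MB,
    Total = Vnodes * (PerVnode + AaePerVnode),
    {total_megabytes, Total div MB}.

With max_open_files at 128 that works out to a little over 9 GB of leveldb memory per server, which is why halving it should make a noticeable difference.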

Matthew


_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
