Hi Anthony,

Watching the memory that Riak consumes is certainly the most important
metric. If Riak runs out of memory, you are screwed. On a dedicated node,
just use the 'free' command to get that information:

             total       used       free     shared    buffers     cached
Mem:      66106100   49834164   16271936          0     232460   41492552
-/+ buffers/cache:    8109152   57996948
Swap:      1951856         36    1951820


The most important line is the second one. There should be "enough" free
memory, and swap used should always be near zero. What counts as enough is
not so clear cut. If latency is really important to you, there should be
enough free memory that your working set fits into it as filesystem
cache, so that reads are served from RAM. Then the disk is mostly used
for writing.
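
If you want to track these numbers instead of reading them by eye, here is a minimal Python sketch that parses the 'free' output shown above. The function name parse_free and the dictionary keys are my own choices for illustration. It assumes the older output layout with a "-/+ buffers/cache" line; newer versions of 'free' print a different layout, which this sketch does not handle.

```python
# Sample taken from the 'free' output quoted in this mail (values in kB).
SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:      66106100   49834164   16271936          0     232460   41492552
-/+ buffers/cache:    8109152   57996948
Swap:      1951856         36    1951820
"""


def parse_free(output):
    """Parse old-style `free` output into a dict of integers (kB).

    Keys (hypothetical names, chosen for this sketch):
      mem_total       - total physical memory
      app_used        - memory used, not counting buffers and page cache
      cache_available - memory free or reclaimable from buffers/cache
      swap_used       - swap currently in use
    """
    stats = {}
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Mem:":
            stats["mem_total"] = int(fields[1])
        elif fields[0] == "-/+":
            # fields: ['-/+', 'buffers/cache:', used, free]
            stats["app_used"] = int(fields[2])
            stats["cache_available"] = int(fields[3])
        elif fields[0] == "Swap:":
            stats["swap_used"] = int(fields[2])
    return stats


stats = parse_free(SAMPLE)
print(stats)
```

The cache_available figure is the one to compare against the size of your working set, and swap_used is the one that should stay near zero.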

One way to eyeball whether you have enough memory left for caching is
to use a tool like 'iostat' (in the sysstat package on
Debian). It shows you the disk utilization of your system. I usually
call it like this:

iostat -x 2
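
To graph the utilization rather than watch it scroll by, a rough Python sketch like the following could pull the %util column out of captured 'iostat -x' output. The function name parse_util is hypothetical. The set and order of columns differ between sysstat versions, so the sketch looks up %util by its header name instead of assuming a fixed position.

```python
def parse_util(output):
    """Return {device: %util} from the last report in `iostat -x` output.

    Assumes each device table starts with a header line whose first
    field begins with "Device" and which contains a "%util" column.
    """
    util = {}
    col = None  # index of the %util column in the current table
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            col = None  # a blank line ends the current table
            continue
        if fields[0].startswith("Device"):
            col = fields.index("%util") if "%util" in fields else None
            util = {}  # a new report starts, so keep only the latest one
            continue
        if col is not None and len(fields) > col:
            # Some locales print a decimal comma instead of a point.
            util[fields[0]] = float(fields[col].replace(",", "."))
    return util
```

Only the last report is kept because the first report iostat prints covers the time since boot, not the most recent interval. So you would feed the function the text of something like 'iostat -x 2 2' and get the figures for the last two seconds.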


Of course you should also watch CPU and network utilization, but usually
disk or memory becomes a problem first.

Cheers,
Nico

On Wednesday, 23.03.2011, at 23:45 -0700, Anthony Molinaro wrote:
> Hi Nico,
> 
>    It's unclear.  riak-admin status eventually calls riak_kv_stat:get_stats,
> which states
> 
> %%</dd><dt> mem_total
> %%</dt><dd> The first element of the tuple returned by
> %%          {@link memsup:get_memory_data/0}.
> %%
> %%</dd><dt> mem_allocated
> %%</dt><dd> The second element of the tuple returned by
> %%          {@link memsup:get_memory_data/0}.
> %%
> 
> The man page for memsup states
> 
>   Returns  the  result  of the latest memory check, where Total is
>   the total memory size and Allocated the allocated  memory  size,
>   in bytes.
> 
> which doesn't tell me if that's filesystem cache or not.  According to
> top the riak beam.smp is using 3G of virtual and 2.7G of resident
> on node1 which don't match any of the values from before.  Attaching
> to the riak node and running memory() I see
> 
> [{total,2392208240},
>  {processes,15012600},
>  {processes_used,12797856},
>  {system,2377195640},
>  {atom,824297},
>  {atom_used,812024},
>  {binary,898656},
>  {code,8336856},
>  {ets,556632}]
> 
> Which seems to reflect what top claims.  I'm just curious what to look at
> to determine when I need to add new nodes.  I'm currently capturing the
> statistics riak provides and putting them into rrds, and mean response time
> is great (95,99, and 100 have spikes quite regularly which I still don't fully
> understand the cause of, but mean/median is pretty good <1ms).  But I'm
> wondering how to detect when the whole thing is about to come crashing down.
> 
> I've used Cassandra for the last 20 months in production and had the same
> issue, it works great then it falls over, and unfortunately with such evenly
> spaced data, everything tends to fall over at once.  I just don't want that
> to happen with my riak cluster, so am wondering how to tell if you are close
> to needing to grow.
> 
> Anyone have any ideas?
> 
> -Anthony
> 
> 
> On Thu, Mar 24, 2011 at 01:21:05AM +0100, Nico Meyer wrote:
> > Hi Anthony,
> > 
> > are you sure you are not including the filesystem cache in your
> > mem_allocated values? It will grow to use all of the free memory or
> > the total size of your bitcask data files, whichever is smaller.
> > 
> > We have about 100 million keys per node, and riak uses about 7GB of RAM.
> > 
> > Cheers,
> > Nico
> > 
> > On 23.03.2011 23:24, Anthony Molinaro wrote:
> > >So a question about when to add new nodes.  I'm looking at the output of
> > >this script and the output of riak-admin status to attempt to figure out
> > >if it's time to grow a cluster.
> > >
> > >I have 4 nodes 1024 partitions replication factor 3, currently with a
> > >single bitcask single bucket where both the key and the value are 36 bytes.
> > >
> > >According to the bitcask spreadsheet the overhead per key is 40 bytes
> > >
> > >The current key counts/memory are
> > >
> > >              key_counts  mem_total  mem_allocated   (key_count*76)
> > >node1        22381785  25269010432  21015953408     1701015660
> > >node2        22378092  25269010432  14076137472     1700734992
> > >node3        22373770  25269010432  21565509632     1700406520
> > >node4        22382394  25269010432  21493731328     1701061944
> > >
> > > >node2 failed at some point and was replaced with a new node.
> > >
> > >So there is some oddness here I don't understand.  According to the
> > >calculated value I should see about 1.7GB per box used, instead I see
> > >21GB on most machines except for the one which was restarted which has
> > >14GB.  From looking at memory it seems like I should be adding some nodes
> > >real soon or amount allocated will hit the total amount.  Or maybe there's
> > >a memory leak which will reduce the amount of memory (as with node2)?
> > >
> > >I'm just trying to figure out why I seem to almost be out of memory with
> > >23 million documents when the Bitcask capacity planning spreadsheet seems
> > > >to suggest I should be able to have 282 million with 20 GiB of free RAM.
> > >
> > >Confused,
> > >
> > >-Anthony
> > >
> > >On Wed, Mar 16, 2011 at 12:04:48PM -0700, Aphyr wrote:
> > >>I'm trying to track some basic metrics so we can plan for cluster
> > >>capacity, monitor transfers, etc. Figured this might be of interest
> > >>to other riak admins. Apologies if my erlang is nonidiomatic, I'm
> > >>still learning. :)
> > >>
> > >>#!/usr/bin/env escript
> > >>%%! -name riakstatuscheck -setcookie riak
> > >>
> > >>main([]) ->  main(["riak@127.0.0.1"]);
> > >>main([Node]) ->
> > >>   io:format("~w\n", [
> > >>     lists:foldl(
> > >>       fun({_VNode, Count}, Sum) ->  Sum + Count end,
> > >>       0,
> > > >>       rpc:call(list_to_atom(Node), riak_kv_bitcask_backend, key_counts, [])
> > >>     )
> > >>   ]).
> > >>
> > >>
> > >>$ ./riakstatus riak@127.0.0.1
> > >>18729
> > >>
> > >>_______________________________________________
> > >>riak-users mailing list
> > >>riak-users@lists.basho.com
> > >>http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > >
> 


