OS process not running. I think the whole cluster crashes because the
other nodes suddenly receive more traffic, which makes them crash as
well.

This also happened on 1.3.1, but everything has seemed stable for some days
now. I guess the main reason this was happening, and may happen again, is
that Riak is taking too much memory from the system. This is the usage I see
on a random machine in my cluster when no M/R jobs are running:

Cpu(s):  0.2%us,  0.1%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7118944k total,  6619652k used,   499292k free,    11304k buffers
Swap:        0k total,        0k used,        0k free,  3173148k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11392 riak      20   0 8917m 3.2g  40m S  2.7 46.5   2080:41 beam.smp
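
As a rough sanity check on the `top` figures above, the sketch below subtracts buffers and page cache from the "used" number to estimate how much memory processes actually hold. It assumes that the buffers and cached memory are reclaimable by the kernel, which is only approximately true (shared memory and tmpfs are also counted as cached). The variable names are illustrative only.

```python
# Figures copied from the top output above, in KiB.
total_k = 7118944
used_k = 6619652
buffers_k = 11304
cached_k = 3173148

# Estimate of memory held by processes: "used" minus kernel buffers and
# page cache (assumption: both are reclaimable under memory pressure).
app_used_k = used_k - buffers_k - cached_k


def kib_to_gib(kib):
    """Convert KiB to GiB."""
    return kib / (1024 * 1024)


print(f"held by processes: {kib_to_gib(app_used_k):.2f} GiB")
print(f"share of total:    {100 * app_used_k / total_k:.1f}%")
```

Under that assumption, roughly 3.3 GiB is held by processes, which is close to the 3.2g resident size reported for beam.smp; most of the rest of the "used" figure is page cache.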

When M/R jobs are running, almost all of the memory is in use. I wonder if
there is a way to tell Riak to use less memory, at the cost of slower
queries. By the way, I also think my cluster is over-provisioned. As stated
in the GitHub issue:

The cluster is made of 4 machines, 64 partitions, and n_val=2. Each server
has an average of 60GB of data stored. The machines are EC2 High CPU extra
large instances (c1.xlarge), as such they have:

7 GiB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
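
For scale, the cluster figures above can be turned into per-node numbers. This is a back-of-the-envelope sketch that assumes partitions and data are spread evenly across the 4 machines and that every object is stored exactly n_val times; the variable names are illustrative only.

```python
# Cluster figures as stated above.
nodes = 4
ring_size = 64            # number of partitions
n_val = 2                 # replicas per object
avg_data_per_node_gb = 60
ram_per_node_gib = 7

# Assumption: partitions (vnodes) are distributed evenly across nodes.
vnodes_per_node = ring_size // nodes

# Total data on disk across the cluster, replicas included.
total_stored_gb = nodes * avg_data_per_node_gb

# Assumption: every object is stored exactly n_val times.
unique_data_gb = total_stored_gb / n_val

# RAM available per vnode if memory were split evenly among them.
ram_per_vnode_gib = ram_per_node_gib / vnodes_per_node

print(vnodes_per_node, total_stored_gb, unique_data_gb, ram_per_vnode_gib)
```

Under these assumptions each machine runs 16 vnodes and has well under half a GiB of RAM per vnode before the operating system and the Erlang VM take their share.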





--
View this message in context: 
http://riak-users.197444.n3.nabble.com/Unexpected-Riak-1-3-crash-tp4027359p4027649.html
Sent from the Riak Users mailing list archive at Nabble.com.
