I have a cluster of six Riak nodes that has been operating for a few
months on Amazon EC2. Because this is a development deployment with
very light usage at present, I am running it on cheap "micro" instances.
I am using riak_0.14.0-1_amd64.deb with no changes to the default
app.config except to modify the IP addresses.
If you look at the graph of CPU usage for these instances over the last
two weeks
http://eamonn.org/riak/riak-cluster-cpu.png
you see an interesting pattern.
Each node's CPU usage climbs gradually over about two days and then
suddenly drops slightly, forming a saw-tooth pattern. The average has
also risen slowly from its low initial level several months ago. It now
seems to have reached an equilibrium, with three of the nodes at 50% and
three at 60%.
Most of this usage happens when the cluster is not being used
externally. The little spikes you see on the graph are probably the
actual external access via the REST API.
I assume this activity is caused by the continuous "gossip" between the
nodes. Perhaps the different equilibrium CPU percentages are related to
the share of the data items each node holds. Does the saw-tooth pattern
indicate some kind of garbage collection?
The Riak cluster does seem to work correctly with reasonable latency, at
least under low load. (I have not yet done any load testing.)
Is this pattern expected? Is it a sign of some problem with my
configuration? Any suggestions for how to tune the cluster to run
better on EC2 micro instances? Any suggestions for which metrics to use
to decide when to scale the cluster dynamically by spinning nodes up or
down?
Thanks,
__
Eamonn O'Brien-Strain
HP Labs