On 05/03/13 11:36, Daniel Iwan wrote:
What is the correct approach to make sure nodes are ready and Riak service
is fully up and running and can take requests?
Since our app depends on Riak we need to somehow make sure Riak is ready
before our services start.

G'day!

We run 256 vnodes on 4 nodes. Our average Riak startup time in our virtualised test environment is about 60-70 seconds, although it can sometimes take longer (I've seen over 300 seconds before I moved to faster disks!).

I watch the Riak console.log files and wait for the "Wait complete for service riak_kv" message to appear on each node before starting our application.

I have a nightly cron job that updates our test Riak database from production. The cron job probably does most of what your test framework is doing: stops the application, stops Riak, rsyncs the production database files, starts Riak, waits for Riak to settle, starts the application.

I use 'dsh'[http://sourceforge.net/projects/dsh/] to ssh to the four Riak nodes to start Riak. On each Riak node I have the following script:

################################################################
#!/bin/bash

pattern=$1
file=$2

if [[ -z $pattern || -z $file ]]; then
  echo "Usage: $0 <pattern> <file>"
  exit 1
fi

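# Wait for string and then exit.
# tail -n 0 -f follows the file from its current end, so only new lines are
# seen; grep -q exits on the first match. kill $! then stops the background
# tail so it does not linger.
grep -q -- "$pattern" <( exec tail -n 0 -f "$file" ); kill $!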
#################################################################

This script will wait for a string to appear in a file and then exit.

I run it from my cron job like this:

dsh -r ssh -c -m node1 -m node2 -m node3 -m node4 -- /usr/local/bin/waitforstring '"Wait complete for service riak_kv"' /var/log/riak/console.log

Using the above command in your test framework allows you to wait for your Riak nodes to start before starting your application. 'dsh' won't exit until the commands on all four nodes have completed.

Of course, there's a potential race condition if Riak starts faster than you can ssh and run the "waitforstring" script. If the string you're waiting for has already appeared in the log file before you start looking for it you'll never see it and end up waiting forever. As always, YMMV.
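
One way to narrow that race, sketched here under assumptions rather than taken from the setup above: note how many lines console.log has before starting Riak, then follow the file from the next line instead of from its end. The function name wait_from_line, its arguments, and the use of GNU timeout to bound the wait are all illustrative.

```shell
#!/bin/bash
# Hypothetical variant of waitforstring: start reading at a known line number
# instead of at the end of the file, and give up after a timeout.

# Usage: wait_from_line <pattern> <file> <start_line> <timeout_seconds>
wait_from_line() {
  local pattern=$1 file=$2 start=$3 secs=$4
  # tail -n +N prints from line N onward; -f keeps following new lines.
  # timeout (GNU coreutils) stops grep and returns 124 if nothing matches in time.
  timeout "$secs" grep -q -- "$pattern" <( exec tail -n "+$start" -f "$file" )
  local rc=$?
  # Stop the background tail whether or not the pattern was found.
  kill $! 2>/dev/null
  return $rc
}
```

For example, run start=$(( $(wc -l < /var/log/riak/console.log) + 1 )) before starting Riak, then call wait_from_line 'Wait complete for service riak_kv' /var/log/riak/console.log "$start" 600 once it has been started. This assumes the log is not rotated or truncated in between.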

An alternative is to have a loop running "riak-admin ringready" and "riak-admin transfers" and exit when they are both showing desired output. However, watching the log file is a nice, low impact method that has been working great for me.
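
A minimal sketch of such a loop follows. The function name wait_for_riak and the three variables are hypothetical, and the strings it matches ("TRUE" at the start of the ringready output, "No transfers active" from transfers) are assumptions that should be checked against the output of your Riak version.

```shell
#!/bin/bash
# Sketch: poll riak-admin until the ring is ready and no transfers are active.
# The matched strings below are assumptions; compare them with what your
# Riak version actually prints before relying on this.

RIAK_ADMIN=${RIAK_ADMIN:-riak-admin}
READY_PATTERN=${READY_PATTERN:-'^TRUE'}
IDLE_PATTERN=${IDLE_PATTERN:-'No transfers active'}

# Usage: wait_for_riak <max_attempts> <sleep_seconds>
# Returns 0 as soon as both checks pass, 1 if the attempts run out.
wait_for_riak() {
  local attempts=$1 delay=$2 i
  for (( i = 0; i < attempts; i++ )); do
    if "$RIAK_ADMIN" ringready 2>/dev/null | grep -q -- "$READY_PATTERN" &&
       "$RIAK_ADMIN" transfers 2>/dev/null | grep -q -- "$IDLE_PATTERN"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

Something like wait_for_riak 60 5 would then poll for up to about five minutes. Unlike the log watcher, this has no race with a message that has already been logged, at the cost of running riak-admin repeatedly.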

Hope this helps!

Shane.


_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com