On 05/03/13 11:36, Daniel Iwan wrote:
What is the correct approach to make sure nodes are ready and the Riak service
is fully up and running and can take requests?
Since our app depends on Riak, we need to somehow make sure Riak is ready
before our services start.
G'day!
We run 256 vnodes on 4 nodes. Our average Riak startup time in our
virtualised test environment is about 60-70 seconds although it
sometimes can take longer (I've seen over 300 seconds before I moved to
faster disk!).
I watch the Riak console.log files and wait for the "Wait complete for
service riak_kv" message to appear on each node before starting our
application.
I have a nightly cron job that updates our test Riak database from
production. The cron job probably does most of what your test framework
is doing: stops the application, stops Riak, rsyncs the production
database files, starts Riak, waits for Riak to settle, starts the
application.
I use 'dsh'[http://sourceforge.net/projects/dsh/] to ssh to the four
Riak nodes to start Riak. On each Riak node I have the following script:
################################################################
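#!/bin/bash
# Usage: waitforstring <pattern> <file>
pattern=$1
file=$2
if [[ -z $pattern || -z $file ]]; then
    echo "Usage: $0 <pattern> <file>" >&2
    exit 1
fi
# Follow only lines appended from now on (-n 0). grep -q exits at the first
# match; $! is the PID of the process substitution (the exec'd tail), so
# the kill stops tail instead of leaving it running.
grep -q -- "$pattern" <( exec tail -n 0 -f "$file" ); kill $!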
#################################################################
This script will wait for a string to appear in a file and then exit.
I run it from my cron job like this:
dsh -r ssh -c -m node1 -m node2 -m node3 -m node4 -- \
    /usr/local/bin/waitforstring '"Wait complete for service riak_kv"' \
    /var/log/riak/console.log
Using the above command in your test framework allows you to wait for
your Riak nodes to start before starting your application. 'dsh' won't
exit until the commands on all four nodes have completed.
Of course, there's a potential race condition if Riak starts faster than
you can ssh and run the "waitforstring" script. If the string you're
waiting for has already appeared in the log file before you start
looking for it you'll never see it and end up waiting forever. As
always, YMMV.
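One way to narrow that race, as a minimal sketch rather than what my script does: record how many lines the log already has before starting Riak, then look only at lines added after that point. The function names (log_line_count, wait_for_string_since), the 300-second default timeout, and the switch from "tail -f" to polling once a second are all assumptions for illustration. It also assumes console.log is not rotated between taking the count and waiting.

```shell
#!/bin/bash
# Sketch: wait for a pattern in a log file, looking only at lines written
# after a recorded offset, so a match that lands before we start watching
# is still seen. Names and defaults here are illustrative assumptions.

# Print the current number of lines in a file (0 if it does not exist yet).
log_line_count() {
    local file=$1
    if [[ -f $file ]]; then
        wc -l < "$file" | tr -d ' '
    else
        echo 0
    fi
}

# wait_for_string_since <pattern> <file> <lines_to_skip> [timeout_seconds]
# Returns 0 once the pattern appears after the skipped lines, 1 on timeout.
wait_for_string_since() {
    local pattern=$1 file=$2 skip=$3 timeout=${4:-300}
    local deadline=$(( SECONDS + timeout ))
    while (( SECONDS < deadline )); do
        # tail -n +K prints from line K onwards, i.e. everything after "skip".
        if [[ -f $file ]] && tail -n "+$(( skip + 1 ))" "$file" | grep -q -- "$pattern"; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Intended use on each node (not executed here):
#   before=$(log_line_count /var/log/riak/console.log)
#   riak start
#   wait_for_string_since "Wait complete for service riak_kv" \
#       /var/log/riak/console.log "$before"
```

Because the offset is taken before Riak starts, it no longer matters how quickly the message shows up, and the timeout means a node that never comes up makes the job fail instead of hanging.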
An alternative is to have a loop running "riak-admin ringready" and
"riak-admin transfers" and exit when they are both showing the desired
output. However, watching the log file is a nice, low-impact method that
has been working great for me.
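For completeness, the polling alternative might look roughly like the sketch below. The strings it greps for ("TRUE" from ringready, "No transfers active" from transfers) are my assumption about what the desired output looks like, so check them against your Riak version. The function name wait_for_riak, the overridable *_CMD and *_OK variables, and the timeout and interval defaults are also made up for illustration.

```shell
#!/bin/bash
# Sketch: poll two status commands until both report the desired output.
# The commands are overridable so the loop can be tried without a Riak node.
RINGREADY_CMD=${RINGREADY_CMD:-"riak-admin ringready"}
TRANSFERS_CMD=${TRANSFERS_CMD:-"riak-admin transfers"}

# Assumed markers of "desired output"; confirm against your Riak version.
RINGREADY_OK=${RINGREADY_OK:-"TRUE"}
TRANSFERS_OK=${TRANSFERS_OK:-"No transfers active"}

# wait_for_riak [timeout_seconds] [interval_seconds]
# Returns 0 once both commands print their marker, 1 on timeout.
wait_for_riak() {
    local timeout=${1:-300} interval=${2:-5}
    local deadline=$(( SECONDS + timeout ))
    while (( SECONDS < deadline )); do
        # The command variables are deliberately unquoted so they split
        # into a command name plus its arguments.
        if $RINGREADY_CMD 2>&1 | grep -q -- "$RINGREADY_OK" &&
           $TRANSFERS_CMD 2>&1 | grep -q -- "$TRANSFERS_OK"; then
            return 0
        fi
        sleep "$interval"
    done
    return 1
}

# Intended use (not executed here):
#   wait_for_riak 300 5 || { echo "Riak did not settle in time" >&2; exit 1; }
```

Unlike the log watcher, this has no race with startup, since it asks for the current state each time, but it does run riak-admin repeatedly on the node while it waits.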
Hope this helps!
Shane.