Re: Best approach to prepare to shutdown a cassandra node

Javier Canillas Fri, 13 Oct 2017 11:42:49 -0700

As far as I know, nodetool stopdaemon does a "kill -9".

Or did it change?
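
For reference, here is a minimal sketch of the stop sequence discussed in the quoted thread below. It is not a tested procedure: the NODETOOL variable and the function names wait_for_exit and graceful_stop are made up for illustration, and the ordering (disable client protocols, drain, then SIGTERM) is an assumption. disablegossip is left out because the quoted log shows drain announcing the shutdown over gossip by itself.

```shell
#!/usr/bin/env bash
# Sketch only: NODETOOL, wait_for_exit and graceful_stop are illustrative
# names, not part of Cassandra.
NODETOOL="${NODETOOL:-bin/nodetool}"

# Wait up to $2 seconds (default 10) for PID $1 to exit.
# Returns 0 if the process is gone, non-zero if it is still alive.
wait_for_exit() {
    local pid="$1" remaining="${2:-10}"
    while [ "$remaining" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        remaining=$((remaining - 1))
    done
    ! kill -0 "$pid" 2>/dev/null
}

# Stop client traffic, drain, then terminate the process with PID $1.
graceful_stop() {
    local pid="$1"
    "$NODETOOL" disablebinary   # stop listening for CQL clients
    "$NODETOOL" disablethrift   # stop listening for Thrift clients
    "$NODETOOL" drain           # node stops accepting writes (see quoted reply)
    kill "$pid"                 # SIGTERM first
    # Fall back to SIGKILL only if the process is still alive after 10s.
    wait_for_exit "$pid" 10 || kill -9 "$pid"
}
```

Usage would be something like: graceful_stop "$cassandra_pid", with the PID found the same way as in the script quoted below.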


2017-10-12 23:49 GMT-03:00 Anshu Vajpayee <anshu.vajpa...@gmail.com>:

> Why are you killing the process when we have nodetool stopdaemon?
>
> On Fri, Oct 13, 2017 at 1:49 AM, Javier Canillas <
> javier.canil...@gmail.com> wrote:
>
>> That's what I thought.
>>
>> Thanks!
>>
>> 2017-10-12 14:26 GMT-03:00 Hannu Kröger <hkro...@gmail.com>:
>>
>>> Hi,
>>>
>>> Drain should be enough.  It stops accepting writes and after that
>>> cassandra can be safely shut down.
>>>
>>> Hannu
>>>
>>> On 12 October 2017 at 20:24:41, Javier Canillas (
>>> javier.canil...@gmail.com) wrote:
>>>
>>> Hello everyone,
>>>
>>> I have been working with Cassandra for some time, but every time I need to
>>> shut down a node (for any reason, like upgrading the version or moving the
>>> instance to another host) I see several errors in the client applications
>>> (yes, I'm using the official Java driver).
>>>
>>> By the way, I'm starting C* as a stand-alone process
>>> <https://docs.datastax.com/en/cassandra/3.0/cassandra/initialize/referenceStartCprocess.html?hl=start>,
>>> and C* version is 3.11.0.
>>>
>>> The way I have implemented the shutdown process is something like the
>>> following:
>>>
>>> # Drain all information from commitlog into sstables
>>>
>>> bin/nodetool drain
>>>
>>>
>>> # Find the PID of the Cassandra JVM
>>> cassandra_pid=$(ps -ef | grep "java.*apache-cassandra" | grep -v "grep" | awk '{print $2}')
>>> if [ -n "$cassandra_pid" ] && [ "$cassandra_pid" -ne 1 ]; then
>>>         echo "Asking Cassandra to shutdown (nodetool drain doesn't stop cassandra)"
>>>         kill "$cassandra_pid"
>>>
>>>         echo -n "+ Checking it is down. "
>>>         counter=10
>>>         # Wait up to 10 seconds while the process is still alive
>>>         while [ "$counter" -ne 0 ] && kill -0 "$cassandra_pid" > /dev/null 2>&1
>>>         do
>>>                 echo -n ". "
>>>                 counter=$((counter - 1))
>>>                 sleep 1s
>>>         done
>>>         echo ""
>>>         if ! kill -0 "$cassandra_pid" > /dev/null 2>&1; then
>>>                 echo "+ It's down."
>>>         else
>>>                 echo "- Killing Cassandra."
>>>                 kill -9 "$cassandra_pid"
>>>         fi
>>> else
>>>         echo "Careful: there was a problem finding the Cassandra PID"
>>> fi
>>>
>>> Should I add the following lines at the beginning?
>>>
>>> echo "shutting down cassandra gracefully with: nodetool disablegossip"
>>> $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablegossip
>>> echo "shutting down cassandra gracefully with: nodetool disablebinary"
>>> $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablebinary
>>> echo "shutting down cassandra gracefully with: nodetool disablethrift"
>>> $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablethrift
>>>
>>> The shutdown log is the following:
>>>
>>> WARN  [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,343 StorageService.java:321 - Stopping gossip by operator request
>>> INFO  [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,344 Gossiper.java:1532 - Announcing shutdown
>>> INFO  [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,355 StorageService.java:2268 - Node /10.254.169.36 state jump to shutdown
>>> INFO  [RMI TCP Connection(12)-127.0.0.1] 2017-10-12 14:20:56,141 Server.java:176 - Stop listening for CQL clients
>>> INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,472 StorageService.java:1442 - DRAINING: starting drain process
>>> INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,474 HintsService.java:220 - Paused hints dispatch
>>> INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,477 Gossiper.java:1532 - Announcing shutdown
>>> INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,480 StorageService.java:2268 - Node /127.0.0.1 state jump to shutdown
>>> INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:01,483 MessagingService.java:984 - Waiting for messaging service to quiesce
>>> INFO  [ACCEPT-/192.168.6.174] 2017-10-12 14:21:01,485 MessagingService.java:1338 - MessagingService has terminated the accept() thread
>>> INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,095 HintsService.java:220 - Paused hints dispatch
>>> INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,111 StorageService.java:1442 - DRAINED
>>>
>>> Disabling gossip seemed like a good idea, but judging by the logs, Cassandra
>>> may use gossip to tell the other nodes gracefully that it is going down, so
>>> I don't know whether it's a good or a bad idea.
>>>
>>> Disabling Thrift and the binary protocol should only prevent new connections;
>>> requests on connections that are already established and running should
>>> still be allowed to finish.
>>>
>>> Any thoughts or comments?
>>>
>>> Thanks
>>>
>>> Javier.
>>>
>>>
>>>
>>
>
>
> --
> *Regards,*
> *Anshu *
>
>
>
