Hi,

Drain should be enough. It stops accepting writes, and after that Cassandra can be safely shut down.

Hannu

On 12 October 2017 at 20:24:41, Javier Canillas (javier.canil...@gmail.com) wrote:

Hello everyone,

I have been working with Cassandra for some time, but every time I need to shut down a node (for any reason, such as upgrading the version or moving the instance to another host) I see several errors in the client applications (yes, I'm using the official Java driver).

By the way, I'm starting C* as a stand-alone process <https://docs.datastax.com/en/cassandra/3.0/cassandra/initialize/referenceStartCprocess.html?hl=start>, and the C* version is 3.11.0.

The way I have implemented the shutdown process is something like the following:

    # Drain all information from commitlog into sstables
    bin/nodetool drain

    cassandra_pid=`ps -ef|grep "java.*apache-cassandra"|grep -v "grep"|awk '{print $2}'`
    if [ ! -z "$cassandra_pid" ] && [ "$cassandra_pid" -ne "1" ]; then
        echo "Asking Cassandra to shutdown (nodetool drain doesn't stop cassandra)"
        kill $cassandra_pid

        echo -n "+ Checking it is down. "
        counter=10
        # Poll while the process is still alive, for at most 10 seconds
        while [ "$counter" -ne 0 ] && kill -0 $cassandra_pid > /dev/null 2>&1
        do
            echo -n ". "
            ((counter--))
            sleep 1s
        done
        echo ""

        if ! kill -0 $cassandra_pid > /dev/null 2>&1; then
            echo "+ Its down."
        else
            echo "- Killing Cassandra."
            kill -9 $cassandra_pid
        fi
    else
        echo "Care there was a problem finding Cassandra PID"
    fi

Should I add at the beginning the following lines?

    echo "Shutting down cassandra gracefully with: nodetool disable gossip"
    $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablegossip
    echo "Shutting down cassandra gracefully with: nodetool disable binary protocol"
    $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablebinary
    echo "Shutting down cassandra gracefully with: nodetool disable thrift"
    $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablethrift

The shutdown log is the following:

    WARN [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,343 StorageService.java:321 - Stopping gossip by operator request
    INFO [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,344 Gossiper.java:1532 - Announcing shutdown
    INFO [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,355 StorageService.java:2268 - Node /10.254.169.36 state jump to shutdown
    INFO [RMI TCP Connection(12)-127.0.0.1] 2017-10-12 14:20:56,141 Server.java:176 - Stop listening for CQL clients
    INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,472 StorageService.java:1442 - DRAINING: starting drain process
    INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,474 HintsService.java:220 - Paused hints dispatch
    INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,477 Gossiper.java:1532 - Announcing shutdown
    INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,480 StorageService.java:2268 - Node /127.0.0.1 state jump to shutdown
    INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:01,483 MessagingService.java:984 - Waiting for messaging service to quiesce
    INFO [ACCEPT-/192.168.6.174] 2017-10-12 14:21:01,485 MessagingService.java:1338 - MessagingService has terminated the accept() thread
    INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,095 HintsService.java:220 - Paused hints dispatch
    INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,111 StorageService.java:1442 - DRAINED

Disabling gossip seemed a good idea, but judging by the logs, the drain may use gossip to tell the other nodes gracefully that the node is going down, so I don't know whether it's a good or a bad idea. Disabling Thrift and the binary protocol should only prevent new connections, while the ones already established and running should be allowed to finish.

Any thoughts or comments?

Thanks,
Javier.
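
For reference, the wait-then-force-kill step of the script above could be factored into a small helper. This is only a sketch, assuming bash; `wait_for_exit` is a hypothetical name chosen for illustration and is not part of Cassandra or nodetool. It polls the PID once per second with `kill -0` until the process is gone or the timeout runs out.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_exit is a hypothetical helper, not a Cassandra tool.
# Usage: wait_for_exit PID [TIMEOUT_SECONDS]
# Returns 0 if the process has exited, 1 if it is still alive after the timeout.
wait_for_exit() {
    local pid="$1"
    local remaining="${2:-10}"   # seconds to wait before giving up (default 10)

    # kill -0 delivers no signal; it only checks that the PID still exists.
    while [ "$remaining" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        remaining=$((remaining - 1))
    done

    # Succeed only if the process is really gone.
    ! kill -0 "$pid" 2>/dev/null
}
```

In the script above it would be used after the drain and the plain `kill`, for example `wait_for_exit "$cassandra_pid" 10 || kill -9 "$cassandra_pid"`, so that `kill -9` only runs when the process has not exited within the timeout.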