Re: Question upon gracefully restarting c* node(s)

Thakrar, Jayesh Wed, 10 Jan 2018 06:21:47 -0800

Just curious - aside from the "sleep", isn't all of this already part of the 
shutdown command?
Is this an "opportunity" to improve C*?
Having worked with RDBMSes, Hadoop and HBase, I have seen that stopping 
communication, flushing the memcache (HBase), and relinquishing ownership of 
data (HBase) are all part of the shutdown process.

From: Alain RODRIGUEZ <arodr...@gmail.com>
Date: Wednesday, January 10, 2018 at 6:19 AM
To: "user cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Question upon gracefully restarting c* node(s)

I agree with the comments above. Cassandra is robust, and we are just talking 
about optimising the process. Nothing is mandatory. Going to an extreme, I would 
say you can pull the node's power cable, plug it back in and call it a restart. 
It should do no harm if your cluster is properly tuned. Yet optimisations are 
welcome, as they reduce entropy and startup time. Plus we are civilized 
operators, not barbarians, aren't we ;-)? It's just more 'clean' and efficient.
Also, historically, it was mandatory to drain when using counters to prevent 
over-counting, as counters are not idempotent. (Not sure about this nowadays.)

Last time I asked this very question I ended up building this command that I 
have been using since then:

`date && nodetool disablebinary && nodetool disablegossip && sleep 10 && 
nodetool flush && nodetool drain && sleep 10 && sudo service cassandra restart`

It does the following:

- Print the date for the record
- Stop all client transports. I never heard of a benefit of shutting down 
the gossip protocol, and so never did so; it might be better but I can't really 
say. This way we stop listening for clients.
- After a short while no clients are using the node. Calling drain then flushes 
memtables and recycles commitlog segments, as Kurt detailed above. Here I add a 
'flush' because I haven't been that lucky in the past with drain: sometimes it 
did not work at all, sometimes it did not clean the commitlogs. I believe 
flushing first makes this restart command more robust.
- Finally restart the service.

I don't think there is only one good way to do this. Also, doing it wrong is 
often not such a big deal.
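
For readers who prefer a script over a one-liner, the steps above can be 
sketched as a small bash function. This is only a sketch: it mirrors the 
command exactly as written above (including the disablegossip step), and the 
names run, graceful_restart, DRY_RUN and PAUSE are made up for illustration, 
as is the assumption that the service is called "cassandra" and is managed by 
`service`.

```shell
#!/usr/bin/env bash
# Sketch only. run, graceful_restart, DRY_RUN and PAUSE are hypothetical
# names, not part of Cassandra or nodetool.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps, in order: print the date, stop the native transport, stop gossip,
# wait, flush memtables, drain, wait, restart the service.
# The && chain stops at the first step that fails.
graceful_restart() {
    local pause="${PAUSE:-10}"   # seconds to wait, 10 as in the one-liner
    date &&
    run nodetool disablebinary &&
    run nodetool disablegossip &&
    run sleep "$pause" &&
    run nodetool flush &&
    run nodetool drain &&
    run sleep "$pause" &&
    run sudo service cassandra restart
}
```

Calling `DRY_RUN=1 graceful_restart` prints the nodetool and service commands 
without running them, which is a cheap way to review the order before using it 
on a real node.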

C*heers,
-----------------------
Alain Rodriguez - @arodream - 
al...@thelastpickle.com<mailto:al...@thelastpickle.com>
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-01-08 3:33 GMT+00:00 Jeff Jirsa 
<jji...@gmail.com<mailto:jji...@gmail.com>>:
The sequence does have some objective benefits, especially stopping transports 
and then gossip: it tells everything you’re going offline before you do, so 
requests won’t get dropped or have to speculate to other replicas.

--
Jeff Jirsa

On Jan 7, 2018, at 7:22 PM, kurt greaves 
<k...@instaclustr.com<mailto:k...@instaclustr.com>> wrote:
None are essential. Cassandra will shut down gracefully in any scenario as long 
as it's not killed with a SIGKILL. However, drain does have a few benefits over 
just a normal shutdown. It will stop a few extra services (batchlog, 
compactions) and, importantly, it will also force recycling of dirty commitlog 
segments, meaning there will be fewer commitlog files to replay on startup, 
which reduces startup time.
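
One rough way to see this effect is to count the commitlog segment files 
before and after a drain. The helper below is a sketch under assumptions: the 
function name is made up, segments are assumed to be named CommitLog-*.log, and 
/var/lib/cassandra/commitlog is only a common default (the real path is 
whatever commitlog_directory in cassandra.yaml says).

```shell
#!/usr/bin/env bash
# Hypothetical helper: count commitlog segment files in a directory.
# Assumes segment files match CommitLog-*.log; the default path below is
# an assumption, check commitlog_directory in cassandra.yaml.
count_commitlog_segments() {
    local dir="${1:-/var/lib/cassandra/commitlog}"
    # Only regular files directly inside the directory are counted.
    find "$dir" -maxdepth 1 -type f -name 'CommitLog-*.log' | wc -l | tr -d ' '
}
```

Running `count_commitlog_segments` once before `nodetool drain` and once after 
gives a quick, if crude, indication of how much there is left to replay.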

A comment in the code for drain also indicates that it will wait for 
in-progress streaming to complete, but I haven't managed to find 1. where this 
occurs, or 2. if it actually differs from a normal shutdown. Note that this is 
all w.r.t. 2.1. In 3.0.10 and 3.10 drain and shutdown do more or less the exact 
same thing; however, drain will log some extra messages.

On 2 January 2018 at 07:07, Jing Meng 
<self.rel...@gmail.com<mailto:self.rel...@gmail.com>> wrote:
Hi all.

Recently we made a change to our production env c* cluster (2.1.18) - placing 
the commit log on the same SSD where data is stored, which requires restarting 
all nodes.

Before restarting a cassandra node, we ran the following nodetool utils:
$ nodetool disablethrift && sleep 5
$ nodetool disablebinary && sleep 5
$ nodetool disablegossip && sleep 5
$ nodetool drain && sleep 5

It was "graceful" as expected (no significant errors found), but the process is 
still a mystery to us: are the commands used above "sufficient", and if so, why? 
The official doc (docs.datastax.com<http://docs.datastax.com>) did not help with 
this operational detail, though "nodetool drain" is apparently essential.
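
One way to check that the disable commands above took effect before draining 
is to query the matching nodetool status commands. The sketch below assumes 
that statusbinary, statusthrift and statusgossip each print "running" or 
"not running" (worth confirming on your version, and statusthrift only exists 
on releases that still ship Thrift); the function name is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the native transport, Thrift and
# gossip all report "not running". The exact output strings are an
# assumption; verify them against your nodetool version.
node_is_quiet() {
    local svc
    for svc in binary thrift gossip; do
        [ "$(nodetool "status${svc}")" = "not running" ] || return 1
    done
}
```

For example, `node_is_quiet && nodetool drain` would only drain once all three 
report that they are stopped.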
