Re: Question upon gracefully restarting c* node(s)

Alain RODRIGUEZ Wed, 10 Jan 2018 04:20:19 -0800

I agree with the comments above. Cassandra is robust, and we are just talking
about optimising the process. Nothing here is mandatory. Going to an extreme, I
would say you can pull and plug back the node's power cable and call it a
restart. It should do no harm if your cluster is properly tuned. Yet
optimisations are welcome, as they reduce entropy and startup time. Plus we
are civilized operators, not barbarians, aren't we ;-)? It's just more
'clean' and efficient.
Also, historically, it was mandatory to drain when using counters to prevent
over-counting, as counters are not idempotent. (Not sure about this nowadays.)


Last time I asked this very question I ended up building this command that
I have been using since then:

`date && nodetool disablebinary && nodetool disablegossip && sleep 10 &&
nodetool flush && nodetool drain && sleep 10 && sudo service cassandra
restart`

It does the following:

- Print the date for the record
- Stop all client transports. I never heard of a benefit of shutting
down the gossip protocol, and so never did so; it might be better but I
can't really say. This way we stop listening for clients.
- After a short while no clients are using the node. Calling drain then
flushes memtables and recycles commitlog segments, as Kurt detailed above.
Here I add a 'flush' because I haven't been that lucky in the past with
drain: sometimes it did not work at all, sometimes it did not clean the
commitlogs. I believe flushing first makes this restart command more robust.
- Finally restart the service.
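
For readability, the same sequence can be sketched as a small script. This is
only an illustration of the one-liner above, not a recommended tool: the
DRY_RUN and PAUSE variables and the run/graceful_restart function names are
made up for this sketch, and it defaults to printing the commands rather than
executing them.

```shell
#!/usr/bin/env bash
# Sketch of the restart sequence from the one-liner above.
# Defaults to a dry run that only prints each command;
# set DRY_RUN=0 to actually execute them on the node.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch, not a nodetool feature
PAUSE="${PAUSE:-10}"      # seconds to wait, same value as the one-liner

# Print the command, and run it unless we are in dry-run mode.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

graceful_restart() {
  run date                            # print the date for the record
  run nodetool disablebinary          # stop the native client transport
  run nodetool disablegossip          # stop gossip so peers see the node go down
  run sleep "$PAUSE"                  # give clients time to stop using the node
  run nodetool flush                  # flush memtables explicitly before drain
  run nodetool drain                  # flush again and recycle commitlog segments
  run sleep "$PAUSE"
  run sudo service cassandra restart  # finally restart the service
}

graceful_restart
```

Running it as is only prints the eight steps in order; running it with
DRY_RUN=0 on the node executes them and stops at the first command that fails.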

I don't think there is only one good way to do this. Also, doing it wrong is
often not such a big deal.

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-01-08 3:33 GMT+00:00 Jeff Jirsa <jji...@gmail.com>:

> The sequence does have some objective benefits - especially stopping
> transports and then gossip, it tells everything you’re going offline before
> you do, so requests won’t get dropped or have to speculate to other
> replicas.
>
>
>
> --
> Jeff Jirsa
>
>
> On Jan 7, 2018, at 7:22 PM, kurt greaves <k...@instaclustr.com> wrote:
>
> None are essential. Cassandra will shut down gracefully in any scenario as
> long as it's not killed with a SIGKILL. However, drain does have a few
> benefits over just a normal shutdown. It will stop a few extra services
> (batchlog, compactions) and, importantly, it will also force recycling of
> dirty commitlog segments, meaning there will be fewer commitlog files to
> replay on startup, which reduces startup time.
>
> A comment in the code for drain also indicates that it will wait for
> in-progress streaming to complete, but I haven't managed to find 1. where
> this occurs, or 2. if it actually differs from a normal shutdown. Note that
> this is all w.r.t. 2.1. In 3.0.10 and 3.10 drain and shutdown more or less
> do the exact same thing; however, drain will log some extra messages.
>
> On 2 January 2018 at 07:07, Jing Meng <self.rel...@gmail.com> wrote:
>
>> Hi all.
>>
>> Recently we made a change to our production env c* cluster (2.1.18):
>> moving the commit log onto the same SSD where the data is stored, which
>> requires restarting all nodes.
>>
>> Before restarting a cassandra node, we ran the following nodetool utils:
>> $ nodetool disablethrift && sleep 5
>> $ nodetool disablebinary && sleep 5
>> $ nodetool disablegossip && sleep 5
>> $ nodetool drain && sleep 5
>>
>> It was "graceful" as expected (no significant errors found), but the
>> process is still a mystery to us: are the commands used above "sufficient",
>> and if so, why? The official doc (docs.datastax.com) did not help with this
>> operational detail, though "nodetool drain" is apparently essential.
>>
>
>
