That is some good info. To add just a little more: knowing what the pending security updates are for your nodes helps you decide what to do afterwards, so read the security update notes from your vendor.

Java or Cassandra update? Of course the service needs to be restarted - do a rolling upgrade and restart the `cassandra` service as usual.

Linux kernel update? Node needs a full reboot, so follow a rolling reboot plan.

Other OS updates? Most can be done without affecting Cassandra. For instance, an OpenSSH security update to patch some vulnerability should most certainly be done as soon as possible, and those node updates can even be done in parallel without causing any problems for the JVM or the Cassandra service. Most intelligent package update systems will install the update and restart the affected service, in this hypothetical case `sshd`. A rough sketch of the per-node sequence for the restart and reboot cases above follows.
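Since you have so many nodes, you will probably script that per-node sequence. Below is a minimal sketch of the drain / restart-or-reboot / wait-for-UN loop, assuming passwordless SSH to nodes addressed by IP, a systemd service named `cassandra`, and `nodetool` available both on the nodes and on the machine running the script - the IPs, timeouts, and commands are placeholders, so adapt it to your environment rather than treating it as a finished tool.

```python
import subprocess
import time

def ssh(host, cmd, check=True):
    # Run a command on a node over SSH (assumes passwordless SSH and sudo).
    return subprocess.run(["ssh", "-o", "ConnectTimeout=5", host, cmd], check=check)

def wait_for_ssh(host, timeout=900):
    # After a reboot, wait until the node answers SSH again.
    deadline = time.time() + timeout
    while time.time() < deadline:
        if ssh(host, "true", check=False).returncode == 0:
            return
        time.sleep(15)
    raise TimeoutError(f"{host} did not come back after reboot")

def wait_until_normal(host, timeout=1800):
    # Poll `nodetool status` locally until the node shows as UN (Up/Normal).
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.run(["nodetool", "status"],
                             capture_output=True, text=True).stdout
        if any(line.startswith("UN") and host in line for line in out.splitlines()):
            return
        time.sleep(30)
    raise TimeoutError(f"{host} is not back to UN after {timeout}s")

def update_node(host, reboot=False):
    ssh(host, "nodetool drain")                       # flush memtables, stop taking traffic
    if reboot:                                        # kernel update: full reboot
        ssh(host, "sudo systemctl stop cassandra")
        ssh(host, "sudo reboot", check=False)         # connection drops; don't treat as failure
        wait_for_ssh(host)
        ssh(host, "sudo systemctl start cassandra")   # in case it isn't enabled on boot
    else:                                             # Java/Cassandra update: restart is enough
        ssh(host, "sudo systemctl restart cassandra")
    wait_until_normal(host)

for node in ["10.0.0.1", "10.0.0.2"]:                 # placeholder IPs, one node at a time
    update_node(node, reboot=True)
```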

Michael

On 1/30/20 3:56 AM, Erick Ramirez wrote:
There is no need to shut down the application because you should be able to carry out the operating system upgrade without an outage to the database, particularly since you have a lot of nodes in your cluster.

Provided your cluster has sufficient capacity, you might even be able to upgrade multiple nodes in parallel to reduce the upgrade window. If you decide to do nodes in parallel and you fully understand the token allocations and where the nodes are positioned in the ring in each DC, make sure you only upgrade nodes which are at least 5 nodes "away" to the right, so you know none of them have overlapping token ranges and they're not replicas of each other.
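To make that "at least 5 nodes away" rule concrete, here is a small illustration of splitting one DC's nodes (already listed in ring/token order, e.g. worked out from nodetool ring) into parallel batches whose members sit exactly that far apart. It assumes single-token nodes and a gap you have derived from your replication factor; with vnodes this positional reasoning doesn't apply so simply, so treat it as a sketch of the idea only.

```python
MIN_GAP = 5   # from the advice above; derive it from your replication factor

def parallel_batches(ring_order, gap=MIN_GAP):
    # Node i goes into batch i % gap, so members of a batch sit exactly `gap`
    # positions apart in the ring. Caveat: the ring wraps around, so if
    # len(ring_order) is not a multiple of `gap`, a batch's first and last
    # members can end up closer than `gap` - do those stragglers serially.
    batches = [[] for _ in range(gap)]
    for i, node in enumerate(ring_order):
        batches[i % gap].append(node)
    return batches

ring = [f"10.0.0.{i}" for i in range(1, 21)]          # placeholder: 20 nodes in ring order
for n, batch in enumerate(parallel_batches(ring), start=1):
    print(f"batch {n}: {', '.join(batch)}")
```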

Other points to consider are:

  * If a node goes down (for whatever reason), I suggest you upgrade the
    OS on the node before bringing it back up. It's already down so you
    might as well take advantage of it since you have so many nodes to
    upgrade.
  * Resist the urge to run nodetool decommission or nodetool removenode
    if you encounter an issue while upgrading a node. This is a common
    knee-jerk reaction which can prove costly because the cluster will
    rebalance automatically, adding more time to your upgrade window.
    Either fix the problem on the server or replace the node using the
    "replace_address" flag (see the sketch after this list).
  * Test, test, and test again. Familiarity with the process is your
    friend when the unexpected happens.
  * Plan ahead and rehearse your recovery method (i.e. replace the node)
    should you run into unexpected issues.
  * Stick to the plan and be prepared to implement it -- don't deviate.
    Don't spend 4 hours or more investigating why a server won't start.
  * Be decisive. Activate your recovery/remediation plan immediately.
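
On the replace_address point: below is a minimal sketch of staging the replacement flag on the rebuilt node before its first start. It assumes a packaged install with /etc/cassandra/cassandra-env.sh and the replace_address_first_boot option; the exact file, path, and flag name vary by Cassandra version and distro, so verify against the docs for your release.

```python
import subprocess

ENV_FILE = "/etc/cassandra/cassandra-env.sh"   # assumption: location varies by package/distro

def start_as_replacement(dead_node_ip):
    # Tell the replacement node to take over the dead node's tokens and stream
    # its data, instead of letting the cluster rebalance the way decommission
    # or removenode would. Run as root on the replacement node.
    flag = f'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot={dead_node_ip}"\n'
    with open(ENV_FILE, "a") as f:
        f.write(flag)
    subprocess.run(["systemctl", "start", "cassandra"], check=True)
    # Once bootstrap completes and the node shows as UN, it's tidy to remove
    # the flag again, although replace_address_first_boot is only honoured on
    # the node's first start.
```

The key point is the same as in the list above: replacing keeps token ownership stable, so the rest of the cluster doesn't start shuffling data in the middle of your upgrade.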

I'm sure others will chime in with their recommendations. Let us know how you go as I'm sure others would be interested in hearing about your experience. Not a lot of shops have a deployment as large as yours, so you are in an enviable position. Good luck!

On Thu, Jan 30, 2020 at 3:45 PM Anshu Vajpayee <anshu.vajpa...@gmail.com <mailto:anshu.vajpa...@gmail.com>> wrote:

    Hi Team,
    What is the best way to patch OS of 1000 nodes Multi DC Cassandra
    cluster where we cannot suspend application traffic (we can redirect
    traffic to one DC).

    Please suggest if anyone has any best practice around it.

    --
    Cheers,
    Anshu V

