Checking on multiple nodes won't help if the joining node suffers from any of the issues I described, as it will likely be flapping up and down frequently, and the existing nodes in the cluster may never reach agreement until the joining node stays up (or stays down) for a while. However, it would be very strange if this were persistent behaviour. If the 'nodetool status' output on each node remained unchanged for hours and the outputs still disagreed between nodes, that could be an indicator that something else has gone wrong.
Does the strange behaviour go away after the joining node completes the streaming and fully joins the cluster?
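To compare each node's view quickly, something along these lines can help (the host list is only a placeholder):

    # print each node's view of the ring; hostnames are illustrative
    for h in node1 node2 node3; do
        echo "== $h =="
        ssh "$h" nodetool status
    done

Running 'nodetool describecluster' on each node is also a quick way to check whether they agree on schema versions.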
On 11/07/2022 10:46, Marc Hoppins wrote:
I am beginning to wonder…

If you recall, I stated that I had checked status on a number of other nodes from both datacentres, and the joining node shows up. No errors are occurring anywhere; data is streaming; the node is joining. But, as I also stated, on the initial node, which I had only used to run nodetool status, the new node is no longer showing up. Thus the new node has not disappeared from the cluster, only from nodetool status on that particular node – which is already in the cluster, has been so for several weeks, and is also functioning without error.
*From:*Bowen Song via user <user@cassandra.apache.org>
*Sent:* Monday, July 11, 2022 11:40 AM
*To:* user@cassandra.apache.org
*Subject:* Re: Adding nodes

A node in joining state can disappear from the cluster from other nodes' perspective if the joining node stops sending/receiving gossip messages to other nodes. This can happen when the joining node is severely overloaded, has bad network connectivity, or is stuck in long STW GC pauses. Regardless of the reason behind it, the state shown on the joining node itself will remain as joining unless the streaming process has failed.
The node state is propagated between nodes via gossip, and there may 
be a delay before all existing nodes agree on the fact that the 
joining node is no longer in the cluster. Within that delay, different 
nodes in the cluster may show different results in 'nodetool status'.
You should check the logs on the existing nodes and the joining node 
to find out why it is happening, and make appropriate changes if needed.
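For example (the log path assumes a default package install; adjust as needed):

    # on the joining node and a few existing nodes
    grep -E 'ERROR|WARN' /var/log/cassandra/system.log | tail -n 50
    # long stop-the-world GC pauses are logged by the GCInspector
    grep GCInspector /var/log/cassandra/system.log | tail -n 20
    # see what this node's gossip layer currently believes about the others
    nodetool gossipinfo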
On 11/07/2022 09:23, Marc Hoppins wrote:

    Further oddities…

    I was sitting here watching our new new node being added (nodetool
    status being run from one of the seed nodes) and all was going
    well.  Then I noticed that our new new node was no longer
    visible.  I checked the service on the new new node and it was
    still running. So I checked status from this node, and it shows in
    the status report (still UJ and streaming data), but it takes a
    little longer to get the results than it did when it was visible
    from the seed.

    I checked status from a few different nodes in both datacentres
    (including other seeds) and the new new node shows up, but from the
    original seed node it does not appear in nodetool status. Can
    anyone shed any light on this phenomenon?

    *From:*Marc Hoppins <marc.hopp...@eset.com>
    *Sent:* Monday, July 11, 2022 10:02 AM
    *To:* user@cassandra.apache.org
    *Cc:* Bowen Song <bo...@bso.ng>
    *Subject:* RE: Adding nodes

    Well then…

    I left this on Friday (still running) and came back to it today
    (Monday) to find the service stopped.  So, I blitzed this node
    from the ring and began anew with a different new node.

    I rather suspect the problem was with trying to use Ansible to add
    these initially - despite the fact that I had a serial limit of 1
    and a pause of 90s for starting the service on each new node
    (based on the time taken when setting up this Cassandra cluster).

    So…moving forward…

    From what I read, it is recommended to add only one new node at a
    time. This leads me to:

    Although I see the new node's LOAD progressing far faster than on
    the previous failed attempt, it is still going to take several
    hours to move from UJ to UN, which means I’ll be at this all week
    for the 12 new nodes. If our LOAD per node is around 400-600GB, is
    there any practical method to speed up adding multiple new nodes
    that is unlikely to cause problems? After all, in the modern world
    of big (how big is big?) data, 600G per node is far less than the
    real BIG big-data.
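
    From what I read, the stream throughput throttle is one knob that
    might help, though I don't know whether raising it is safe here
    (the values below are only illustrative):

        # current streaming throttle, in megabits per second
        nodetool getstreamthroughput
        # raise it, or pass 0 to disable throttling entirely
        nodetool setstreamthroughput 0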

    Marc

    *From:*Jeff Jirsa <jji...@gmail.com>
    *Sent:* Friday, July 8, 2022 5:46 PM
    *To:* cassandra <user@cassandra.apache.org>
    *Cc:* Bowen Song <bo...@bso.ng>
    *Subject:* Re: Adding nodes

    Having a node UJ but not sending/receiving streams is an invalid
    state (unless 4.0 moved the streaming data out of netstats? I'm not
    100% sure, but I'm 99% sure it should be there).

    It likely stopped the bootstrap process long ago with an error
    (which you may not have seen), and is running without being in the
    ring, but also not trying to join the ring.

    145GB vs 1.1T could be bits vs bytes - that's a factor of 8, and
    1.1T divided by 8 is roughly 137GB, which is in the right ballpark
    of 145GiB - or it could be that you streamed data and compacted it
    away. Hard to say, but less important: the fact that it's UJ but
    not streaming means there's a different problem.

    If it's me, I do this (not guaranteed to work, your mileage may
    vary, etc):

    1) Look for errors in the logs of ALL hosts. In the joining host,
    look for an exception that stops bootstrap. In the others, look
    for messages about errors streaming, and/or exceptions around file
    access. In all of those hosts, check to see if any of them think
    they're streaming (nodetool netstats again).

    2) Stop the joining host. It's almost certainly not going to
    finish now. Remove data directories, commitlog directory, saved
    caches, hints. Wait 2 minutes. Make sure every other host in the
    cluster sees it disappear from the ring. Then, start it fresh and
    let it bootstrap again (you could alternatively try the resumable
    bootstrap option, but I never use it). A rough sketch of these
    steps follows.
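
    Something like this, assuming a systemd service and the default
    data paths (adjust both to your install):

        # 1) look for a bootstrap-stopping error on the joining host
        grep -E 'ERROR|Exception' /var/log/cassandra/system.log | tail -n 50

        # 2) stop it and wipe its state so it bootstraps from scratch
        sudo systemctl stop cassandra
        sudo rm -rf /var/lib/cassandra/data/* \
                    /var/lib/cassandra/commitlog/* \
                    /var/lib/cassandra/saved_caches/* \
                    /var/lib/cassandra/hints/*

        # wait ~2 minutes and confirm it is gone from 'nodetool status'
        # on the other hosts, then start it to bootstrap afresh
        sudo systemctl start cassandra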

    On Fri, Jul 8, 2022 at 2:56 AM Marc Hoppins
    <marc.hopp...@eset.com> wrote:

        Ifconfig shows RX of 1.1T. This doesn't seem to fit with the
        LOAD of 145GiB (nodetool status), unless I am reading that
        wrong...and the fact that this node still has a status of UJ.

        Netstats on this node shows:
        Read Repair Statistics:
        Attempted: 0
        Mismatch (Blocking): 0
        Mismatch (Background): 0
        Pool Name                    Active   Pending   Completed   Dropped
        Large messages                  n/a         0           0         0
        Small messages                  n/a        53   569755545  15740262
        Gossip messages                 n/a         0      288878         2
        None of this addresses the issue of not being able to add more
        nodes.

        -----Original Message-----
        From: Bowen Song via user <user@cassandra.apache.org>
        Sent: Friday, July 8, 2022 11:47 AM
        To: user@cassandra.apache.org
        Subject: Re: Adding nodes

        I would assume that's 85 GB (i.e. gigabytes) then, which is
        approximately 79 GiB (i.e. gibibytes). This still sounds awfully
        slow - less than 1MB/s over a full day (24 hours).

        You said CPU and network aren't the bottleneck. Have you
        checked the disk IO? Also, be mindful with CPU usage. It can
        still be a bottleneck if one thread uses 100% of a CPU core
        while all other cores are idle.
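
        For example, with the sysstat tools (assuming they are
        installed):

            # per-device disk utilisation, refreshed every second
            iostat -x 1
            # per-core CPU usage - watch for one core pinned at 100%
            mpstat -P ALL 1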

        On 08/07/2022 07:09, Marc Hoppins wrote:
        > Thank you for pointing that out.
        >
        > 85 gigabytes/gibibytes/GIGABYTES/GIBIBYTES/whatever name you
        > care to give it
        >
        > CPU and bandwidth are not the problem.
        >
        > Version 4.0.3 but, as I stated, all nodes use the same
        > version so the version is not important either.
        >
        > Existing nodes have 350-400+(choose whatever you want to call a
        > gigabyte)
        >
        > The problem appears to be that adding new nodes is a serial
        > process, which is fine when there is no data and each node is
        > added within 2 minutes. It is hardly practical in production.
        >
        > -----Original Message-----
        > From: Bowen Song via user <user@cassandra.apache.org>
        > Sent: Thursday, July 7, 2022 8:43 PM
        > To: user@cassandra.apache.org
        > Subject: Re: Adding nodes
        >
        >
        > 86Gb (that's gigabits, which is 10.75GB, i.e. gigabytes)
        > taking an entire day seems obviously too long. I would check
        > the network bandwidth, disk IO and CPU usage and find out
        > where the bottleneck is.
        >
        > On 07/07/2022 15:48, Marc Hoppins wrote:
        >> Hi all,
        >>
        >> Cluster of 2 DC and 24 nodes
        >>
        >> DC1 (RF3) = 12 nodes, 16 tokens each
        >> DC2 (RF3) = 12 nodes, 16 tokens each
        >>
        >> Adding 12 more nodes to DC1: I installed Cassandra (version
        >> is the same across all nodes) but, after the first node was
        >> added, I couldn't seem to add any further nodes.
        >>
        >> I check nodetool status and the newly added node is UJ. It
        >> remains this way all day and only 86Gb of data is added to
        >> the node over the entire day (probably not yet complete).
        >> This seems a little slow, and more than a little
        >> inconvenient, to only be able to add one node at a time - or
        >> at least one node every 2 minutes. When the cluster was
        >> created, I timed each node from service start to status UJ
        >> (having a UUID) and it was around 120 seconds. Of course
        >> there was no data.
        >>
        >> Is it possible I have some setting not correctly tuned?
        >>
        >> Thanks
        >>
        >> Marc
