Maybe I am not being clear enough.

The 90/120 seconds was for NEW NODES TO A NEW CLUSTER WITH NO DATA.  This 
tool/suite/application is new to both the database folk and us support folk 
and, since we currently use HBase and can add several nodes at a time to a new 
cluster there, everyone assumed that adding new Cassandra nodes would work 
similarly and that they could be added in multiples.

Now, given that I previously stated I had discovered this is NOT the case, 
please ignore the 90 seconds/120 seconds…ANY seconds.  None of this is now 
important as I am only adding ONE NEW NODE manually.  No seconds. No pauses. No 
timeouts. Just waiting.

From: Bowen Song via user <user@cassandra.apache.org>
Sent: Monday, July 11, 2022 12:13 PM
To: user@cassandra.apache.org
Subject: Re: Adding nodes

How long does it take to add a new node? I'm 100% sure neither 90s nor 120s is 
the answer. The answer is: it varies. If you want to wait for a new node to 
finish being added, be explicit about it and wait until the node has fully 
joined the cluster. Don't put a fixed number of seconds in there.

You can estimate the time for adding many nodes once you've added one node to 
the cluster. The time depends not only on the data size, hardware and network, 
but also on the data in the SSTable files. For example, if a full copy of a 
very large partition exists in many SSTable files but the latest one of them is 
a tombstone, then the data actually streamed is only the tombstone, not the 
other copies of the large data.

BTW, for your own sake, you should consider automating the process to minimise 
the human interaction required to add multiple nodes. It may be manageable when 
you have 5 or 10 nodes to add, but it will quickly spin out of control when you 
have tens or a few hundred of them.

On 11/07/2022 10:41, Marc Hoppins wrote:
“Where did you come up with the 90 seconds number?” The database folk came up 
with THAT number. For myself, I timed adding a new node at 120 seconds for the 
initial setup with no data in the cluster.

“What exactly are you waiting for by doing that?” I wanted to see for myself 
how long it took to add a new node.  Isn’t that what RESEARCH is all about?  I 
suppose I could have just ‘googled’ it.

“Since adding nodes doesn't interfere with the client queries, the time it 
takes to add a node shouldn't be a concern at all…”  It IS a concern if one has 
to add many nodes and the ‘customers’ want some idea of how long the process 
will take.  Or, and I may be alone in this, it would be helpful to know when to 
begin adding the next new node in the ticket.  Therefore, if I know when my 
first node is finished, I will have an idea of how long to wait before checking 
whether subsequent nodes can be joined.

From: Bowen Song via user <user@cassandra.apache.org>
Sent: Monday, July 11, 2022 11:25 AM
To: user@cassandra.apache.org
Subject: Re: Adding nodes

Sleeping/pausing for a fixed amount of time between operations is at best a 
hack to work around an unknown issue; it's almost always better to be 
explicit about what you are waiting for. Where did you come up with the 90 
seconds number? What exactly are you waiting for by doing that? If you want to 
wait until the node's state becomes normal (from joining), be explicit about 
it: check the nodetool output or the system.log file periodically instead of 
waiting for a fixed 90 seconds.
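
A minimal sketch of the explicit wait described above, in Python. It assumes 
nodetool is on the PATH of the machine running the script and that the node's 
address appears in the second column of the nodetool status output. The 
function names and the 60-second polling interval are illustrative only and 
are not part of Cassandra.

```python
import subprocess
import time
from typing import Optional


def node_state(status_output: str, address: str) -> Optional[str]:
    """Return the two-letter status/state code (e.g. 'UJ', 'UN') for address.

    Assumes each node line of `nodetool status` starts with the code,
    followed by the node's address. Returns None if the address is not listed.
    """
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and len(fields[0]) == 2 and fields[1] == address:
            return fields[0]
    return None


def wait_until_normal(address: str, poll_seconds: int = 60) -> None:
    """Poll `nodetool status` until the node reports UN (Up/Normal)."""
    while True:
        result = subprocess.run(
            ["nodetool", "status"], capture_output=True, text=True, check=True
        )
        if node_state(result.stdout, address) == "UN":
            return
        # The interval only controls how often we check, not how long we wait.
        time.sleep(poll_seconds)
```

Calling wait_until_normal("10.0.0.13") (a hypothetical address) returns only 
once that node shows UN, so an automation tool could start the next node after 
it returns. The sketch does not detect a bootstrap that has failed or a service 
that has stopped, so a real script would want to check for that as well.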

Streaming 600GB in a few hours sounds fairly reasonable. Since adding nodes 
doesn't interfere with the client queries, the time it takes to add a node 
shouldn't be a concern at all, as long as it's significantly faster than the 
data growth rate. Just leave it running in the background, and get on with your 
life.

If you must speed up that process and don't care about data inconsistency or 
potential downtime, there are faster ways to do it, but doing that breaks 
consistency and/or availability, which means it will interfere with client 
read/write operations.

A few hundred GB to a few TB per node is pretty common in Cassandra clusters. 
Big data is not about how much data is on EACH node, it's about how much data 
there is in TOTAL.

On 11/07/2022 09:01, Marc Hoppins wrote:
Well then…

I left this on Friday (still running) and came back to it today (Monday) to 
find the service stopped.  So, I blitzed this node from the ring and began anew 
with a different new node.

I rather suspect the problem was with trying to use Ansible to add these 
initially - despite the fact that I had a serial limit of 1 and a pause of 90s 
for starting the service on each new node (based on the time taken when setting 
up this Cassandra cluster).

So…moving forward…

From what I read, it is recommended to add only one new node at a time.  This 
leads me to:

Although I see the new node LOAD is progressing far faster than the previous 
failure, it is still going to take several hours to move from UJ to UN, which 
means I’ll be at this all week for the 12 new nodes. If our LOAD per node is 
around 400-600GB, is there any practical method to speed up adding multiple new 
nodes which is unlikely to cause problems?  After all, in the modern world of 
big (how big is big?) data, 600G per node is far less than the real BIG 
big-data.

Marc

From: Jeff Jirsa <jji...@gmail.com>
Sent: Friday, July 8, 2022 5:46 PM
To: cassandra <user@cassandra.apache.org>
Cc: Bowen Song <bo...@bso.ng>
Subject: Re: Adding nodes

Having a node UJ but not sending/receiving other streams is an invalid state 
(unless 4.0 moved the streaming data out of netstats? I'm not 100% sure, but 
I'm 99% sure it should be there).

It likely stopped the bootstrap process long ago with an error (which you may 
not have seen), and is running without being in the ring, but also not trying 
to join the ring.

145GB vs 1.1T could be bits vs bytes (that's a factor of 8), or it could be 
that you streamed data and compacted it away. Hard to say, but less important - 
the fact that it's UJ but not streaming means there's a different problem.

If it's me, I do this (not guaranteed to work, your mileage may vary, etc):
1) Look for errors in the logs of ALL hosts. In the joining host, look for an 
exception that stops bootstrap. In the others, look for messages about errors 
streaming, and/or exceptions around file access. In all of those hosts, check 
to see if any of them think they're streaming (nodetool netstats again).
2) Stop the joining host. It's almost certainly not going to finish now. Remove 
data directories, commitlog directory, saved caches, hints. Wait 2 minutes. 
Make sure every other host in the cluster sees it disappear from the ring. 
Then, start it fresh and let it bootstrap again. (You could alternatively try 
the resumable bootstrap option, but I never use it.)
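
The log check in step 1 can be sketched in Python as below. The keyword list 
and helper names are assumptions for illustration. What a real bootstrap 
failure looks like in the log depends on the failure, so treat the matches as a 
starting point for reading the log, not as a diagnosis.

```python
from typing import Iterable, List, Sequence

# Assumed keywords; adjust them to whatever your logs actually contain.
DEFAULT_KEYWORDS = ("ERROR", "Exception")


def suspicious_lines(
    lines: Iterable[str], keywords: Sequence[str] = DEFAULT_KEYWORDS
) -> List[str]:
    """Return the lines that contain any of the given keywords."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line for keyword in keywords)
    ]


def scan_log(path: str) -> List[str]:
    """Scan one log file; the caller supplies the path to system.log."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return suspicious_lines(handle)
```

Running scan_log on each host's system.log and reading the matches around the 
time the bootstrap started is one way to cover ALL hosts without paging 
through every file by hand.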



On Fri, Jul 8, 2022 at 2:56 AM Marc Hoppins <marc.hopp...@eset.com> wrote:
Ifconfig shows RX of 1.1T. This doesn't seem to fit with the LOAD of 145GiB 
(nodetool status), unless I am reading that wrong...and the fact that this node 
still has a status of UJ.

Netstats on this node shows (other than :
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name                    Active   Pending      Completed   Dropped
Large messages                  n/a         0              0         0
Small messages                  n/a        53      569755545  15740262
Gossip messages                 n/a         0         288878         2
None of this addresses the issue of not being able to add more nodes.

-----Original Message-----
From: Bowen Song via user <user@cassandra.apache.org>
Sent: Friday, July 8, 2022 11:47 AM
To: user@cassandra.apache.org
Subject: Re: Adding nodes

I would assume that's 85 GB (i.e. gigabytes) then. Which is approximately 79 
GiB (i.e. gibibytes). This still sounds awfully slow - less than 1MB/s over a 
full day (24 hours).
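
The arithmetic behind these figures, as a small Python sketch. The helper names 
are made up for illustration; the inputs are the numbers quoted in this thread.

```python
GB = 10**9    # gigabyte (decimal)
GIB = 2**30   # gibibyte (binary)
SECONDS_PER_DAY = 24 * 60 * 60


def gb_to_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes to binary gibibytes."""
    return gigabytes * GB / GIB


def gigabits_to_gb(gigabits: float) -> float:
    """Convert gigabits to gigabytes (8 bits per byte)."""
    return gigabits / 8


def mb_per_second(gigabytes: float, seconds: float) -> float:
    """Average throughput in decimal megabytes per second."""
    return gigabytes * GB / seconds / 10**6


print(f"{gb_to_gib(85):.1f} GiB")                         # 79.2 GiB
print(f"{gigabits_to_gb(86):.2f} GB")                     # 10.75 GB
print(f"{mb_per_second(85, SECONDS_PER_DAY):.2f} MB/s")   # 0.98 MB/s
```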

You said CPU and network aren't the bottleneck. Have you checked the disk IO? 
Also, be mindful of CPU usage. It can still be a bottleneck if one thread 
uses 100% of a CPU core while all other cores are idle.

On 08/07/2022 07:09, Marc Hoppins wrote:
> Thank you for pointing that out.
>
> 85 gigabytes/gibibytes/GIGABYTES/GIBIBYTES/whatever name you care to
> give it
>
> CPU and bandwidth are not the problem.
>
> Version 4.0.3 but, as I stated, all nodes use the same version so the version 
> is not important either.
>
> Existing nodes have 350-400+(choose whatever you want to call a
> gigabyte)
>
> The problem appears to be that adding new nodes is a serial process, which is 
> fine when there is no data and each node is added within 2 minutes.  It is 
> hardly practical in production.
>
> -----Original Message-----
> From: Bowen Song via user <user@cassandra.apache.org>
> Sent: Thursday, July 7, 2022 8:43 PM
> To: user@cassandra.apache.org
> Subject: Re: Adding nodes
>
> 86Gb (that's gigabits, which is 10.75GB, gigabytes) taking an entire day 
> seems obviously too long. I would check the network bandwidth, disk IO and 
> CPU usage and find out what the bottleneck is.
>
> On 07/07/2022 15:48, Marc Hoppins wrote:
>> Hi all,
>>
>> Cluster of 2 DC and 24 nodes
>>
>> DC1 (RF3) = 12 nodes, 16 tokens each
>> DC2 (RF3) = 12 nodes, 16 tokens each
>>
>> Adding 12 more nodes to DC1: I installed Cassandra (version is the same 
>> across all nodes) but, after the first node added, I couldn't seem to add 
>> any further nodes.
>>
>> I check nodetool status and the newly added node is UJ. It remains this way 
>> all day and only 86Gb of data is added to the node over the entire day 
>> (probably not yet complete).  This seems a little slow, and it is more than 
>> a little inconvenient to only be able to add one node at a time - or at 
>> least one node every 2 minutes.  When the cluster was created, I timed each 
>> node 
>> from service start to status UJ (having a UUID) and it was around 120 
>> seconds.  Of course there was no data.
>>
>> Is it possible I have some setting not correctly tuned?
>>
>> Thanks
>>
>> Marc
