Thank you, Bowen.

As the chart shows, memory usage on the existing nodes increased after the new 
nodes were added. I then stopped writing to one specific table, which reduced 
write throughput by about 15%, and memory usage began to decrease.
I'm not sure whether the decrease happened on its own or was caused by the 
reduced write load.
What is certain is that adding the new nodes increased the native memory usage 
of some existing nodes.
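
To put rough numbers on the native memory, here is a small sketch in Python 
using the figures from my first mail (115 GiB of 125 GiB in use, and a 31 GiB 
heap from -Xms/-Xmx). It assumes that all of the used physical memory belongs 
to the Cassandra process, which may not be true (page cache and other 
processes are ignored), and the function names are made up for illustration.

```python
def used_percent(used_gib: float, total_gib: float) -> float:
    """Percentage of physical memory in use."""
    if total_gib <= 0:
        raise ValueError("total_gib must be positive")
    return used_gib / total_gib * 100


def non_heap_gib(used_gib: float, heap_gib: float) -> float:
    """Rough upper bound on memory used outside the JVM heap.

    Assumption: every byte of used memory belongs to the Cassandra
    process, so anything beyond the fixed heap is counted as non-heap.
    """
    return used_gib - heap_gib


if __name__ == "__main__":
    # Figures from the alarm (115 GiB / 125 GiB) and from -Xms31g/-Xmx31g.
    print(f"{used_percent(115, 125):.1f}% of physical memory in use")
    print(f"about {non_heap_gib(115, 31)} GiB outside the heap")
```

So under that assumption, up to roughly 84 GiB would be outside the heap on 
the affected nodes.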

After reading DataStax's 3.x to 4.x migration guide, it seems that more than 
50% of disk space must be free for the upgrade. This is likely to be a major 
obstacle to upgrading a cluster that is in production.
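
A minimal sketch in Python of how the free space could be checked on each node 
before an upgrade. The 50% threshold is only my reading of the guide, and the 
function names and the example directory are hypothetical. It uses only 
shutil.disk_usage from the standard library.

```python
import shutil


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Fraction of a volume that is free, between 0.0 and 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def has_upgrade_headroom(path: str, required_free: float = 0.5) -> bool:
    """True if the volume holding `path` has at least `required_free` free.

    Assumption: 0.5 reflects the "more than 50%" figure mentioned above.
    """
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= required_free


if __name__ == "__main__":
    # Hypothetical: replace "." with each of the node's data directories.
    for data_dir in ["."]:
        print(data_dir, "ok" if has_upgrade_headroom(data_dir) else "too full")
```

With three data disks per node, the check would have to be run against each 
mounted data directory separately.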


Many thanks.

> On Jan 10, 2022, at 8:53 PM, Bowen Song <bo...@bso.ng> wrote:
> 
> Anything special about the table you stopped writing to? I'm wondering how 
> you determined that the table was the cause of the memory usage increase.
> 
> > For the latest version (3.11.11) upgrade, can the two versions coexist in 
> > the cluster for a while?
> > 
> > Can the 4.x version coexist as well?
> 
> Yes and yes. It is expected that two different versions of Cassandra will be 
> running in the same cluster at the same time while upgrading. This process is 
> often called a zero-downtime upgrade or rolling upgrade. You can perform such 
> an upgrade from 3.11.4 to 3.11.11 or directly to 4.0.1; both are supported. 
> Surprisingly, I can't find any documentation related to this on the 
> cassandra.apache.org website (if you found it, please send me a link). Some 
> other sites have brief guides on this process, such as DataStax 
> <https://www.datastax.com/learn/whats-new-for-cassandra-4/migrating-cassandra-4x#how-the-migration-works>
>  and Instaclustr 
> <https://www.instaclustr.com/support/documentation/cassandra/cassandra-cluster-operations/cassandra-version-upgrades/>,
>  and you should always read the release notes 
> <https://github.com/apache/cassandra/blob/trunk/NEWS.txt>, which list 
> breaking changes and new features, before you perform an upgrade.
> 
> 
> 
> On 10/01/2022 00:18, Eunsu Kim wrote:
>> Thank you for your response.
>> 
>> Fortunately, memory usage came back down over the weekend. I stopped 
>> writing to a specific table last Friday.
>> 
>> <Pasted Graphic-2.png>
>> 
>> 
>> For the latest version (3.11.11) upgrade, can the two versions coexist in 
>> the cluster for a while?
>> 
>> Can the 4.x version coexist as well?
>> 
>>> On Jan 8, 2022, at 1:26 AM, Jeff Jirsa <jji...@gmail.com 
>>> <mailto:jji...@gmail.com>> wrote:
>>> 
>>> 3.11.4 is a very old release, with lots of known bugs. It's possible the 
>>> memory is related to that.
>>> 
>>> If you bounce one of the old nodes, where does the memory end up? 
>>> 
>>> 
>>> On Thu, Jan 6, 2022 at 3:44 PM Eunsu Kim <eunsu.bil...@gmail.com 
>>> <mailto:eunsu.bil...@gmail.com>> wrote:
>>> 
>>> Looking at the memory usage chart, it seems that the physical memory usage 
>>> of the existing node has increased since the new node was added with 
>>> auto_bootstrap=false.
>>> 
>>> <Pasted Graphic-1.png>
>>> 
>>> 
>>>> 
>>>> On Fri, Jan 7, 2022 at 1:11 AM Eunsu Kim <eunsu.bil...@gmail.com 
>>>> <mailto:eunsu.bil...@gmail.com>> wrote:
>>>> Hi,
>>>> 
>>>> I have a Cassandra cluster (3.11.4) that handles a heavy write workload 
>>>> (14k~16k writes per second per node).
>>>> 
>>>> The nodes are physical machines in a data center. There are 30 nodes, and 
>>>> each node has three data disks mounted.
>>>> 
>>>> 
>>>> A few days ago, query timeouts occurred due to Full GC.
>>>> Following this blog post 
>>>> (https://thelastpickle.com/blog/2018/04/11/gc-tuning.html 
>>>> <https://thelastpickle.com/blog/2018/04/11/gc-tuning.html>), the problem 
>>>> seemed to be solved by changing memtable_allocation_type to 
>>>> offheap_objects.
>>>> 
>>>> But today, I got an alarm saying that some nodes are using more than 90% 
>>>> of physical memory (115 GiB / 125 GiB).
>>>> 
>>>> Native memory usage of some nodes is gradually increasing.
>>>> 
>>>> 
>>>> 
>>>> All tables use TWCS, and TTL is 2 weeks.
>>>> 
>>>> Below are the applied JVM options.
>>>> 
>>>> -Xms31g
>>>> -Xmx31g
>>>> -XX:+UseG1GC
>>>> -XX:G1RSetUpdatingPauseTimePercent=5
>>>> -XX:MaxGCPauseMillis=500
>>>> -XX:InitiatingHeapOccupancyPercent=70
>>>> -XX:ParallelGCThreads=24
>>>> -XX:ConcGCThreads=24
>>>> …
>>>> 
>>>> 
>>>> What else can I try?
>>>> 
>>>> I look forward to advice from the experts.
>>>> 
>>>> Regards.
>>> 
>> 
