Just food for thought. Elevated read requests won't result in escalating pending compactions, except in the corner case where the reads trigger additional write work, such as a read repair or the dropping of lurking tombstones deemed droppable. Sustained growth in pending compactions doesn't look like random tripping over corner cases. Absent an increasing number of SSTables, all an elevated read request rate would do is churn the chunk cache. Reads would be slower due to the cache misses, but the memory footprint wouldn't be much different.
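To put numbers on that distinction, a rough monitoring sketch along these lines can show whether pending compactions and the per-table SSTable count are growing together. This is only a sketch: it assumes `nodetool` is on the PATH of the node being watched, `my_keyspace`/`my_table` are placeholders for your own schema, and the output labels parsed here may vary slightly between Cassandra versions.

```python
#!/usr/bin/env python3
"""Sketch: watch pending compactions and live SSTable count over time."""
import re
import subprocess
import time

def pending_compactions() -> int:
    # `nodetool compactionstats` prints a line like "pending tasks: 42"
    out = subprocess.run(["nodetool", "compactionstats"],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"pending tasks:\s*(\d+)", out)
    return int(m.group(1)) if m else -1

def sstable_count(keyspace: str, table: str) -> int:
    # `nodetool tablestats ks.table` prints a line like "SSTable count: 12"
    out = subprocess.run(["nodetool", "tablestats", f"{keyspace}.{table}"],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"SSTable count:\s*(\d+)", out)
    return int(m.group(1)) if m else -1

if __name__ == "__main__":
    while True:
        print(f"pending={pending_compactions()} "
              f"sstables={sstable_count('my_keyspace', 'my_table')}")
        time.sleep(60)
```

If both numbers climb in lockstep, back pressure on compaction rather than the read rate itself is the likelier culprit.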
From: "Steinmaurer, Thomas" <thomas.steinmau...@dynatrace.com> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Wednesday, November 6, 2019 at 2:43 PM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: RE: Cassandra 3.0.18 went OOM several hours after joining a cluster Message from External Sender Reid, thanks for thoughts. I agree with your last comment and I’m pretty sure/convinced that the increasing number of SSTables is causing the issue, although I’m not sure if compaction or read requests (after the node flipped from UJ to UN) or both, but I tend more towards client read requests resulting in accessing a high number of SSTables which basically results in ~ 2Mbyte on-heap usage per BigTableReader instance, with ~ 5K such object instances on the heap. The big question for us is why this starts to pop-up with Cas 3.0 without seeing this with 2.1 in > 3 years production usage. To avoid double work, I will try to continue providing additional information / thoughts on the Cassandra ticket. Regards, Thomas From: Reid Pinchback <rpinchb...@tripadvisor.com> Sent: Mittwoch, 06. November 2019 18:28 To: user@cassandra.apache.org Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster The other thing that comes to mind is that the increase in pending compactions suggests back pressure on compaction activity. GC is only one possible source of that. Between your throughput setting and how your disk I/O is set up, maybe that’s throttling you to a rate where the rate of added reasons for compactions > the rate of compactions completed. In fact, the more that I think about it, I wonder about that a lot. If you can’t keep up with compactions, then operations have to span more and more SSTables over time. You’ll keep holding on to what you read, as you read more of them, until eventually…pop. From: Reid Pinchback <rpinchb...@tripadvisor.com<mailto:rpinchb...@tripadvisor.com>> Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Date: Wednesday, November 6, 2019 at 12:11 PM To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster Message from External Sender My first thought was that you were running into the merkle tree depth problem, but the details on the ticket don’t seem to confirm that. It does look like eden is too small. C* lives in Java’s GC pain point, a lot of medium-lifetime objects. If you haven’t already done so, you’ll want to configure as many things to be off-heap as you can, but I’d definitely look at improving the ratio of eden to old gen, and see if you can get the young gen GC activity to be more successful at sweeping away the medium-lived objects. All that really comes to mind is if you’re getting to a point where GC isn’t coping. That can be hard to sometimes spot on metrics with coarse granularity. Per-second metrics might show CPU cores getting pegged. I’m not sure that GC tuning eliminates this problem, but if it isn’t being caused by that, GC tuning may at least improve the visibility of the underlying problem. 
From: "Steinmaurer, Thomas" <thomas.steinmau...@dynatrace.com<mailto:thomas.steinmau...@dynatrace.com>> Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Date: Wednesday, November 6, 2019 at 11:27 AM To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Subject: Cassandra 3.0.18 went OOM several hours after joining a cluster Message from External Sender Hello, after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several hours a node has successfully joined a cluster (via auto-bootstrap). I have created the following ticket trying to describe the situation, including hprof / MAT screens: https://issues.apache.org/jira/browse/CASSANDRA-15400<https://urldefense.proofpoint.com/v2/url?u=https-3A__nam02.safelinks.protection.outlook.com_-3Furl-3Dhttps-253A-252F-252Furldefense.proofpoint.com-252Fv2-252Furl-253Fu-253Dhttps-2D3A-5F-5Fissues.apache.org-5Fjira-5Fbrowse-5FCASSANDRA-2D2D15400-2526d-253DDwMF-2Dg-2526c-253D9Hv6XPedRSA-2D5PSECC38X80c1h60-5FXWA4z1k-5FR1pROA-2526r-253DOIgB3poYhzp3-5FA7WgD7iBCnsJaYmspOa2okNpf6uqWc-2526m-253DlnQdpMrbVjmjj-5Faf9BwSn1ftI8H2uSyvAya3887aDLk-2526s-253DBEeQbrRZS6Z1i25NSdwRmQVpQ36AvSNz-5Fi8Y9ks5UmA-2526e-253D-26data-3D02-257C01-257Cthomas.steinmaurer-2540dynatrace.com-257C8d53c19106b84b0e4fef08d762dfaad4-257C70ebe3a35b30435d9d677716d74ca190-257C1-257C0-257C637086585097094534-26sdata-3DBMfphm5RaKTpKXwQxLCoL5ePfe9hQg9pHnNAp5e27xQ-253D-26reserved-3D0&d=DwMGaQ&c=9Hv6XPedRSA-5PSECC38X80c1h60_XWA4z1k_R1pROA&r=OIgB3poYhzp3_A7WgD7iBCnsJaYmspOa2okNpf6uqWc&m=md7Mvx-dtRFZI3lVLqI5lQWGcfEFYmADYUx_i_NNgAo&s=zzUonAjLEZ_AuYsQ8sKOrrWin5oRw1R9yd5vvOI4RXU&e=> Would be great if someone could have a look. Thanks a lot. Thomas The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313 The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313