Just food for thought. Elevated read requests won't result in escalating pending compactions, except in the corner case where the reads trigger additional write work, such as a read repair or the dropping of lurking tombstones deemed droppable. Sustained growth in pending compactions doesn't look like random tripping over corner cases. Absent an increasing number of SSTables, all an elevated read request rate would do is churn the chunk cache. Reads would be slower due to the cache misses, but the memory footprint wouldn't be much different.
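To put numbers on that distinction, a rough monitoring sketch along these lines can show whether pending compactions and the per-table SSTable count are growing together. This is only a sketch: it assumes `nodetool` is on the PATH of the node being watched, `my_keyspace`/`my_table` are placeholders for your own schema, and the output labels parsed here may vary slightly between Cassandra versions.

```python
#!/usr/bin/env python3
"""Sketch: watch pending compactions and live SSTable count over time."""
import re
import subprocess
import time

def pending_compactions() -> int:
    # `nodetool compactionstats` prints a line like "pending tasks: 42"
    out = subprocess.run(["nodetool", "compactionstats"],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"pending tasks:\s*(\d+)", out)
    return int(m.group(1)) if m else -1

def sstable_count(keyspace: str, table: str) -> int:
    # `nodetool tablestats ks.table` prints a line like "SSTable count: 12"
    out = subprocess.run(["nodetool", "tablestats", f"{keyspace}.{table}"],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"SSTable count:\s*(\d+)", out)
    return int(m.group(1)) if m else -1

if __name__ == "__main__":
    while True:
        print(f"pending={pending_compactions()} "
              f"sstables={sstable_count('my_keyspace', 'my_table')}")
        time.sleep(60)
```

If both numbers climb in lockstep, back pressure on compaction rather than the read rate itself is the likelier culprit.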
From: "Steinmaurer, Thomas" <thomas.steinmau...@dynatrace.com> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Wednesday, November 6, 2019 at 2:43 PM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: RE: Cassandra 3.0.18 went OOM several hours after joining a cluster Message from External Sender Reid, thanks for thoughts. I agree with your last comment and I’m pretty sure/convinced that the increasing number of SSTables is causing the issue, although I’m not sure if compaction or read requests (after the node flipped from UJ to UN) or both, but I tend more towards client read requests resulting in accessing a high number of SSTables which basically results in ~ 2Mbyte on-heap usage per BigTableReader instance, with ~ 5K such object instances on the heap. The big question for us is why this starts to pop-up with Cas 3.0 without seeing this with 2.1 in > 3 years production usage. To avoid double work, I will try to continue providing additional information / thoughts on the Cassandra ticket. Regards, Thomas From: Reid Pinchback <rpinchb...@tripadvisor.com> Sent: Mittwoch, 06. November 2019 18:28 To: user@cassandra.apache.org Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster The other thing that comes to mind is that the increase in pending compactions suggests back pressure on compaction activity. GC is only one possible source of that. Between your throughput setting and how your disk I/O is set up, maybe that’s throttling you to a rate where the rate of added reasons for compactions > the rate of compactions completed. In fact, the more that I think about it, I wonder about that a lot. If you can’t keep up with compactions, then operations have to span more and more SSTables over time. You’ll keep holding on to what you read, as you read more of them, until eventually…pop. From: Reid Pinchback <rpinchb...@tripadvisor.com<mailto:rpinchb...@tripadvisor.com>> Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Date: Wednesday, November 6, 2019 at 12:11 PM To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Subject: Re: Cassandra 3.0.18 went OOM several hours after joining a cluster Message from External Sender My first thought was that you were running into the merkle tree depth problem, but the details on the ticket don’t seem to confirm that. It does look like eden is too small. C* lives in Java’s GC pain point, a lot of medium-lifetime objects. If you haven’t already done so, you’ll want to configure as many things to be off-heap as you can, but I’d definitely look at improving the ratio of eden to old gen, and see if you can get the young gen GC activity to be more successful at sweeping away the medium-lived objects. All that really comes to mind is if you’re getting to a point where GC isn’t coping. That can be hard to sometimes spot on metrics with coarse granularity. Per-second metrics might show CPU cores getting pegged. I’m not sure that GC tuning eliminates this problem, but if it isn’t being caused by that, GC tuning may at least improve the visibility of the underlying problem. 
From: "Steinmaurer, Thomas" <thomas.steinmau...@dynatrace.com<mailto:thomas.steinmau...@dynatrace.com>> Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Date: Wednesday, November 6, 2019 at 11:27 AM To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>> Subject: Cassandra 3.0.18 went OOM several hours after joining a cluster Message from External Sender Hello, after moving from 2.1.18 to 3.0.18, we are facing OOM situations after several hours a node has successfully joined a cluster (via auto-bootstrap). I have created the following ticket trying to describe the situation, including hprof / MAT screens: https://issues.apache.org/jira/browse/CASSANDRA-15400<https://urldefense.proofpoint.com/v2/url?u=https-3A__nam02.safelinks.protection.outlook.com_-3Furl-3Dhttps-253A-252F-252Furldefense.proofpoint.com-252Fv2-252Furl-253Fu-253Dhttps-2D3A-5F-5Fissues.apache.org-5Fjira-5Fbrowse-5FCASSANDRA-2D2D15400-2526d-253DDwMF-2Dg-2526c-253D9Hv6XPedRSA-2D5PSECC38X80c1h60-5FXWA4z1k-5FR1pROA-2526r-253DOIgB3poYhzp3-5FA7WgD7iBCnsJaYmspOa2okNpf6uqWc-2526m-253DlnQdpMrbVjmjj-5Faf9BwSn1ftI8H2uSyvAya3887aDLk-2526s-253DBEeQbrRZS6Z1i25NSdwRmQVpQ36AvSNz-5Fi8Y9ks5UmA-2526e-253D-26data-3D02-257C01-257Cthomas.steinmaurer-2540dynatrace.com-257C8d53c19106b84b0e4fef08d762dfaad4-257C70ebe3a35b30435d9d677716d74ca190-257C1-257C0-257C637086585097094534-26sdata-3DBMfphm5RaKTpKXwQxLCoL5ePfe9hQg9pHnNAp5e27xQ-253D-26reserved-3D0&d=DwMGaQ&c=9Hv6XPedRSA-5PSECC38X80c1h60_XWA4z1k_R1pROA&r=OIgB3poYhzp3_A7WgD7iBCnsJaYmspOa2okNpf6uqWc&m=md7Mvx-dtRFZI3lVLqI5lQWGcfEFYmADYUx_i_NNgAo&s=zzUonAjLEZ_AuYsQ8sKOrrWin5oRw1R9yd5vvOI4RXU&e=> Would be great if someone could have a look. Thanks a lot. Thomas The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313 The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313