If the memory wasn't being used and it got pushed to swap, then the right thing happened. It's a common misconception that swap is bad. Using swap isn't bad in itself. What is bad is data churning in and out of swap space so much that your latency increases, either from the page faults themselves or from contention between swap activity and other disk I/O. For the case it sounds like we've been discussing, where the buffers aren't in use, basically all that would happen is that memory garbage would be shoved out of the way. Honestly, the thought I had in mind when you first described this was to intentionally use cgroups to twiddle swappiness, so that a short-term co-tenant load could be prioritized and shove stale C* memory out of the way, then twiddle the settings back when you prefer C* to be the winner in resource demand.
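As a rough sketch of what that swappiness twiddling could look like, assuming the cgroup v1 memory controller and a hypothetical cgroup named "cassandra" (cgroup v2 dropped the per-cgroup swappiness knob, so this trick assumes v1):

    # Make the kernel more willing to swap out the Cassandra cgroup
    # while the short-term co-tenant load runs (0-100, higher = swappier):
    echo 100 | sudo tee /sys/fs/cgroup/memory/cassandra/memory.swappiness

    # ... run the co-tenant workload ...

    # Twiddle it back so C* wins resource demand again:
    echo 1 | sudo tee /sys/fs/cgroup/memory/cassandra/memory.swappiness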
From: manish khandelwal <manishkhandelwa...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, April 22, 2020 at 7:23 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Impact of setting low value for flag -XX:MaxDirectMemorySize

I am running Spark (max heap 4G) and a Java application (4G) alongside my Cassandra server (8G). After heavy loading, if I run a Spark process, some main memory is pushed into swap. But if I restart Cassandra and then execute the Spark process, memory is not pushed into swap.

The idea behind asking the above question was whether -XX:MaxDirectMemorySize is the right knob to use to contain the off-heap memory. I understand, as Erick said, that I have to test since I might get an OutOfMemoryError. Or are there better options available for handling such situations?

On Tue, Apr 21, 2020 at 9:52 PM Reid Pinchback <rpinchb...@tripadvisor.com> wrote:

Note that from a performance standpoint, it's hard to see a reason to care about releasing the memory unless you are co-tenanting C* with something else that is significant in its memory demands, and significant on a schedule anti-correlated with when C* needs that memory. If you aren't doing that, then conceivably the only other time you'd care is if you are seeing read or write stalls on disk I/O because the O/S buffer cache is too small. But if you were getting a lot of impact from stalls, it would mean C* was very busy... and if it's very busy, then it's likely using its buffers as they are intended.

From: HImanshu Sharma <himanshusharma0...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Saturday, April 18, 2020 at 2:06 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Impact of setting low value for flag -XX:MaxDirectMemorySize

From the codebase, as much as I understood, once a buffer is allocated it is not freed but instead added to a recyclable pool. When a new request comes in, an attempt is made to fetch memory from the recyclable pool, and only if none is available is a new allocation made. If the memory limit is breached while allocating, we get this OOM error. I would like to know whether my understanding is correct.

If what I am thinking is correct, is there a way to get this buffer pool reduced when there is low traffic? What I have observed on my system is that this memory remains static even when there is no traffic.

Regards
Manish

On Sat, Apr 18, 2020 at 11:13 AM Erick Ramirez <erick.rami...@datastax.com> wrote:

Like most things, it depends on (a) what you're allowing and (b) how much your nodes require. MaxDirectMemorySize is the upper bound for off-heap memory used for direct byte buffers. C* uses it for Netty, so if your nodes are busy servicing requests, they'd have more I/O threads consuming memory. During low-traffic periods, less memory is allocated to service requests, and it eventually gets freed up by GC tasks. But if traffic volumes are high, memory doesn't get freed up quickly enough, so the max is reached. When this happens, you'll see OOMs like "OutOfMemoryError: Direct buffer memory" show up in the logs.
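To make that failure mode concrete, here is a minimal standalone sketch (not Cassandra's actual allocation path) that reproduces the error by holding direct buffers past the configured cap:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Run with: java -XX:MaxDirectMemorySize=64m DirectBufferOom
    // Ends with "java.lang.OutOfMemoryError: Direct buffer memory"
    // once allocations exceed the 64m cap.
    public class DirectBufferOom {
        public static void main(String[] args) {
            List<ByteBuffer> held = new ArrayList<>(); // hold references so GC can't reclaim them
            while (true) {
                held.add(ByteBuffer.allocateDirect(8 * 1024 * 1024)); // 8 MB off-heap, counted against the cap
                System.out.println("allocated " + (held.size() * 8) + " MB of direct memory");
            }
        }
    }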
You can play around with different values but make sure you test it exhaustively before trying it out in production. Cheers!
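For anyone trying different values: the flag typically goes in Cassandra's conf/jvm.options, or gets appended to JVM_OPTS in conf/cassandra-env.sh on older versions. The 2G below is only an illustrative value to test with, not a recommendation:

    # conf/jvm.options
    -XX:MaxDirectMemorySize=2G

    # or, in conf/cassandra-env.sh:
    JVM_OPTS="$JVM_OPTS -XX:MaxDirectMemorySize=2G"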