On Thu, Dec 6, 2018 at 3:39 PM Riccardo Ferrari <ferra...@gmail.com> wrote:
> To be honest I've never seen the OOM in action on those instances. My Xmx
> was 8GB just like yours and that let me think you have some process that is
> competing for memory, is it? Do you have any cron, any backup, anything
> that can trick the OOMKiller ?

Riccardo,

As I've mentioned previously, apart from Docker running Cassandra on the JVM,
there is only a small number of housekeeping processes: cron to trigger log
rotation, a log shipping agent, a node metrics exporter (Prometheus) and a few
other small things. None of them comes anywhere close to Cassandra in memory
requirements, and they are routinely near the bottom of the memory usage
reports from atop and similar tools. Their overhead seems to be minimal.

> My unresponsiveness was seconds long. This is/was bad becasue gossip
> protocol was going crazy by marking nodes down and all the consequences
> this can lead in distributed system, think about hints, dynamic snitch, and
> whatever depends on node availability ...
> Can you share some number about your `tpstats` or system load in general?

Here's some pretty typical tpstats output from one of the nodes:

Pool Name                        Active   Pending      Completed   Blocked  All time blocked
MutationStage                         0         0      319319724         0                 0
ViewMutationStage                     0         0              0         0                 0
ReadStage                             0         0       80006984         0                 0
RequestResponseStage                  0         0      258548356         0                 0
ReadRepairStage                       0         0        2707455         0                 0
CounterMutationStage                  0         0              0         0                 0
MiscStage                             0         0              0         0                 0
CompactionExecutor                    1        55        1552918         0                 0
MemtableReclaimMemory                 0         0           4042         0                 0
PendingRangeCalculator                0         0            111         0                 0
GossipStage                           0         0        6343859         0                 0
SecondaryIndexManagement              0         0              0         0                 0
HintsDispatcher                       0         0            226         0                 0
MigrationStage                        0         0              0         0                 0
MemtablePostFlush                     0         0           4046         0                 0
ValidationExecutor                    1         1           1510         0                 0
Sampler                               0         0              0         0                 0
MemtableFlushWriter                   0         0           4042         0                 0
InternalResponseStage                 0         0           5890         0                 0
AntiEntropyStage                      0         0           5532         0                 0
CacheCleanupExecutor                  0         0              0         0                 0
Repair#250                            1         1              1         0                 0
Native-Transport-Requests             2         0      260447405         0                18

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
HINT                         0
MUTATION                     1
COUNTER_MUTATION             0
BATCH_STORE                  0
BATCH_REMOVE                 0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0

Speaking of CPU utilization, it is consistently within 30-60% on all nodes
(and even lower at night).

> On the tuning side I just went through the following article:
> https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/config/configRecommendedSettings.html
>
> No rollbacks, just moving forward! Right now we are upgrading the instance
> size to something more recent than m1.xlarge (for many different reasons,
> including security, ECU and network). Nevertheless it might be a good idea
> to upgrade to the 3.X branch to leverage on better off-heap memory
> management.

One thing we have noticed very recently is that our nodes are indeed running
low on memory. It now even seems that the IO is a side effect of the impending
OOM, not the other way round, as we had thought initially.

After a fresh JVM start the memory allocation looks roughly like this:

             total       used       free     shared    buffers     cached
Mem:           14G        14G       173M       1.1M        12M       3.2G
-/+ buffers/cache:        11G       3.4G
Swap:           0B         0B         0B

Then, over a number of days, the disk cache ("cached") shrinks all the way
down to unreasonably low numbers, e.g. only 150M. At the same time "free"
stays at the original level and "used" grows all the way up to 14G. Shortly
after that the node becomes unavailable because of the IO, and ultimately,
after some time, the JVM gets killed. Most importantly, the resident size of
the JVM process stays at around 11-12G the whole time, just as it was shortly
after the start.
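The one JVM-internal figure we can easily cross-check against that resident
size is the NIO buffer pool usage (direct and mapped buffers), which the JVM
exposes over JMX as the standard java.nio:type=BufferPool MBeans. Below is a
minimal sketch of reading them from outside the process; the class name, the
default Cassandra JMX port 7199 and the absence of JMX authentication are
assumptions for illustration, not details of our actual setup:

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BufferPoolWatcher {
    public static void main(String[] args) throws Exception {
        // 7199 is Cassandra's default JMX port; adjust the URL and add
        // credentials if JMX auth/SSL is enabled on the node.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            List<BufferPoolMXBean> pools =
                    ManagementFactory.getPlatformMXBeans(conn, BufferPoolMXBean.class);
            // "direct" = NIO direct ByteBuffers (off-heap allocations),
            // "mapped" = memory-mapped files (e.g. SSTables read via mmap).
            for (BufferPoolMXBean pool : pools) {
                System.out.printf("%-6s count=%d used=%dM capacity=%dM%n",
                        pool.getName(),
                        pool.getCount(),
                        pool.getMemoryUsed() >> 20,
                        pool.getTotalCapacity() >> 20);
            }
        }
    }
}

The same attributes are also visible interactively in jconsole under
java.nio:type=BufferPool. Whatever the process holds beyond the heap and these
buffer pools (glibc malloc arenas, thread stacks, GC and compiler overhead)
does not show up there, though.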
How can we find out where the rest of the memory gets allocated? Is it just
some sort of malloc fragmentation?

Since we are running a relatively recent JDK, we have tried the option
-Djdk.nio.maxCachedBufferSize=262144 on one of the nodes, as suggested in this
issue: https://issues.apache.org/jira/browse/CASSANDRA-13931
but we didn't see any improvement. Also, if that were the issue in the first
place, wouldn't we expect the resident size of the JVM process to grow at the
same rate as the available memory shrinks?

Another thing we haven't found an answer to so far is why, within the JVM,
heap.used (<= 6GB) never reaches heap.committed = 8GB.

Any ideas?

Regards,
--
Alex
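P.S. In case the terminology is ambiguous: by heap.used / heap.committed I
mean the usual JVM heap figures, i.e. what the standard MemoryMXBean reports
(used = bytes currently occupied by objects, committed = heap memory the JVM
has actually reserved from the OS, capped by -Xmx). A small sketch for reading
them over JMX, under the same assumptions as the previous snippet (default
port 7199, no JMX auth, illustrative class name):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class HeapUsageCheck {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            MemoryMXBean mem =
                    ManagementFactory.getPlatformMXBean(conn, MemoryMXBean.class);
            MemoryUsage heap = mem.getHeapMemoryUsage();
            // used      = bytes occupied by objects at this moment
            // committed = heap memory reserved from the OS for the heap
            // max       = the -Xmx limit
            System.out.printf("heap used=%dM committed=%dM max=%dM%n",
                    heap.getUsed() >> 20,
                    heap.getCommitted() >> 20,
                    heap.getMax() >> 20);
        }
    }
}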