Hi Bob,

We are not using Reaper (yet); we are just running tests against the new C* 3.11.3 cluster. The nodes keep crashing, and these are the errors we are getting.

Any other ideas?

Thanks!
Roy

On Wed, Jan 16, 2019 at 8:51 AM Bobbie Haynes <haynes30...@gmail.com> wrote:

> Hi Roy,
> I don't think the memory leak issue is related to the MAP errors. I was
> also using Reaper in our cluster. I have seen the memory leak issue (ERROR
> [Reference-Reaper:1] 2019-01-14 00:03:46,469 Ref.java:224 - LEAK DETECTED)
> when some of the SSTables got corrupted because of a disk space issue we
> had while compactions were running.
> I guess you have to report this memory leak issue to the Reaper tool JIRA.
>
> Thanks,
> Bob
>
> On Mon, Jan 14, 2019 at 8:44 AM Roy Burstein <burstein....@gmail.com>
> wrote:
>
>> Hi,
>>
>> We are testing C* 3.11.3 and we have a mapping issue and possibly leaked
>> memory.
>> It might be related to our configuration; any ideas would be helpful.
>>
>> Cassandra version: 3.11.3
>> OS: CentOS Linux release 7.4.1708 (Core)
>> Kernel: 3.10.0-957.1.3.el7.x86_64
>> JDK: jdk1.8.0_131
>> Heap: same errors with 16GB / 32GB / 64GB.
>>
>> *We are seeing these errors in production:*
>>
>> *java.io.IOException: Map failed:*
>>
>> ERROR [CompactionExecutor:5017] 2019-01-14 00:02:04,763 CassandraDaemon.java:228 - Exception in thread Thread[CompactionExecutor:5017,1,main]
>> org.apache.cassandra.io.FSReadError: java.io.IOException: Map failed
>>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:157) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.util.MmappedRegions$State.add(MmappedRegions.java:310) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.util.MmappedRegions$State.access$400(MmappedRegions.java:246) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:181) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:362) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly(BigTableWriter.java:290) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(SSTableRewriter.java:179) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:134) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:142) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:201) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:274) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
>>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.3.jar:3.11.3]
>>     at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
>> Caused by: java.io.IOException: Map failed
>>     at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:940) ~[na:1.8.0_131]
>>     at org.apache.cassandra.io.util.ChannelProxy.map(ChannelProxy.java:153) ~[apache-cassandra-3.11.3.jar:3.11.3]
>>     ... 23 common frames omitted
>> Caused by: java.lang.OutOfMemoryError: Map failed
>>     at sun.nio.ch.FileChannelImpl.map0(Native Method) ~[na:1.8.0_131]
>>     at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:937) ~[na:1.8.0_131]
>>     ... 24 common frames omitted
>>
>> *LEAK DETECTED:*
>>
>> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,469 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6a4ef142) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1651696741:Memory@[6b91a27c5290..6b91a27de290) was not released before the reference was garbage collected
>> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,520 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6c458f8a) to class org.apache.cassandra.io.util.FileHandle$Cleanup@1179238225:/var/lib/cassandra/data/disk1/sessions_rawdata/sessions_v2_2019_01_13-19be8e90037011e9a45847402874bbd7/mc-1209-big-Index.db was not released before the reference was garbage collected
>> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,520 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@5b90823b) to class org.apache.cassandra.io.util.MmappedRegions$Tidier@783549664:/var/lib/cassandra/data/disk1/sessions_rawdata/sessions_v2_2019_01_13-19be8e90037011e9a45847402874bbd7/mc-1209-big-Data.db was not released before the reference was garbage collected
>> ERROR [Reference-Reaper:1] 2019-01-14 00:03:46,520 Ref.java:224 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@6ecdf763) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$Tidy@1710583516:[Memory@[0..3e24), Memory@[0..45e88)] was not released before the reference was garbage collected
>>
>> *Limits of Cassandra process:*
>>
>> [root@cass063 ~ ]# cat /proc/`ps -ef | grep CassandraDaemon | grep -v grep | awk '{print $2}'`/limits
>> Limit                     Soft Limit     Hard Limit     Units
>> Max cpu time              unlimited      unlimited      seconds
>> Max file size             unlimited      unlimited      bytes
>> Max data size             unlimited      unlimited      bytes
>> Max stack size            8388608        unlimited      bytes
>> Max core file size        0              unlimited      bytes
>> Max resident set          unlimited      unlimited      bytes
>> Max processes             32768          32768          processes
>> Max open files            100000         100000         files
>> Max locked memory         unlimited      unlimited      bytes
>> Max address space         unlimited      unlimited      bytes
>> Max file locks            unlimited      unlimited      locks
>> Max pending signals       766985         766985         signals
>> Max msgqueue size         819200         819200         bytes
>> Max nice priority         0              0
>> Max realtime priority     0              0
>> Max realtime timeout      unlimited      unlimited      us
>>
>> *max_map_count parameter on OS:*
>>
>> [root@cass063 ~]# sysctl vm.max_map_count
>> vm.max_map_count = 1073741824
>>
>> *cassandra.yaml:*
>>
>> cluster_name: 'Cass Cluster'
>> num_tokens: 256
>> hinted_handoff_enabled: false
>> max_hint_window_in_ms: 10800000
>> hinted_handoff_throttle_in_kb: 1024
>> max_hints_delivery_threads: 2
>> hints_directory: /var/lib/cassandra/hints
>> hints_flush_period_in_ms: 10000
>> max_hints_file_size_in_mb: 128
>> batchlog_replay_throttle_in_kb: 1024
>> authenticator: AllowAllAuthenticator
>> authorizer: AllowAllAuthorizer
>> role_manager: CassandraRoleManager
>> roles_validity_in_ms: 2000
>> permissions_validity_in_ms: 2000
>> credentials_validity_in_ms: 2000
>> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>> data_file_directories:
>>     - /var/lib/cassandra/data/disk1
>> commitlog_directory: /var/lib/cassandra/data/disk1/commitlog
>> cdc_enabled: false
>> disk_failure_policy: stop
>> commit_failure_policy: stop
>> prepared_statements_cache_size_mb:
>> thrift_prepared_statements_cache_size_mb:
>> key_cache_size_in_mb: 0
>> key_cache_save_period: 3600
>> row_cache_size_in_mb: 0
>> row_cache_save_period: 0
>> counter_cache_size_in_mb:
>> counter_cache_save_period: 7200
>> saved_caches_directory: /var/lib/cassandra/data/disk1/saved_caches
>> commitlog_sync: periodic
>> commitlog_sync_period_in_ms: 10000
>> commitlog_segment_size_in_mb: 32
>> seed_provider:
>>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>>       parameters:
>>           - seeds: "10.110.30.1,10.110.30.2,10.110.30.3"
>> concurrent_reads: 48
>> concurrent_writes: 96
>> concurrent_counter_writes: 32
>> concurrent_materialized_view_writes: 32
>> file_cache_size_in_mb: 10240
>> memtable_offheap_space_in_mb: 10240
>> memtable_cleanup_threshold: 0.1
>> memtable_allocation_type: offheap_buffers
>> commitlog_total_space_in_mb: 8192
>> memtable_flush_writers: 8
>> index_summary_capacity_in_mb:
>> index_summary_resize_interval_in_minutes: 60
>> trickle_fsync: true
>> trickle_fsync_interval_in_kb: 10240
>> storage_port: 7000
>> ssl_storage_port: 7001
>> listen_address: 10.106.62.34
>> start_native_transport: true
>> native_transport_port: 9042
>> start_rpc: false
>> rpc_address: 0.0.0.0
>> rpc_port: 9160
>> broadcast_rpc_address: 10.106.62.34
>> rpc_keepalive: true
>> rpc_server_type: hsha
>> rpc_max_threads: 128
>> thrift_framed_transport_size_in_mb: 15
>> incremental_backups: false
>> snapshot_before_compaction: false
>> auto_snapshot: true
>> column_index_size_in_kb: 64
>> column_index_cache_size_in_kb: 2
>> concurrent_compactors: 32
>> compaction_throughput_mb_per_sec: 500
>> sstable_preemptive_open_interval_in_mb: 50
>> stream_throughput_outbound_megabits_per_sec: 0
>> read_request_timeout_in_ms: 10000
>> range_request_timeout_in_ms: 10000
>> write_request_timeout_in_ms: 60000
>> counter_write_request_timeout_in_ms: 10000
>> cas_contention_timeout_in_ms: 1000
>> truncate_request_timeout_in_ms: 60000
>> request_timeout_in_ms: 10000
>> slow_query_log_timeout_in_ms: 500
>> cross_node_timeout: false
>> phi_convict_threshold: 12
>> endpoint_snitch: GossipingPropertyFileSnitch
>> dynamic_snitch_update_interval_in_ms: 100
>> dynamic_snitch_reset_interval_in_ms: 600000
>> dynamic_snitch_badness_threshold: 0.5
>> request_scheduler: org.apache.cassandra.scheduler.NoScheduler
>> server_encryption_options:
>>     internode_encryption: none
>>     keystore: conf/.keystore
>>     keystore_password: cassandra
>>     truststore: conf/.truststore
>>     truststore_password: cassandra
>> client_encryption_options:
>>     enabled: false
>>     optional: false
>>     keystore: conf/.keystore
>>     keystore_password: cassandra
>> internode_compression: dc
>> inter_dc_tcp_nodelay: false
>> tracetype_query_ttl: 86400
>> tracetype_repair_ttl: 604800
>> enable_user_defined_functions: false
>> enable_scripted_user_defined_functions: false
>> enable_materialized_views: true
>> windows_timer_interval: 1
>> transparent_data_encryption_options:
>>     enabled: false
>>     chunk_length_kb: 64
>>     cipher: AES/CBC/PKCS5Padding
>>     key_alias: testing:1
>>     key_provider:
>>         - class_name: org.apache.cassandra.security.JKSKeyProvider
>>           parameters:
>>               - keystore: conf/.keystore
>>                 keystore_password: cassandra
>>                 store_type: JCEKS
>>                 key_password: cassandra
>> tombstone_warn_threshold: 1000
>> tombstone_failure_threshold: 100000
>> batch_size_warn_threshold_in_kb: 5
>> batch_size_fail_threshold_in_kb: 50
>> unlogged_batch_across_partitions_warn_threshold: 10
>> compaction_large_partition_warning_threshold_mb: 10
>> gc_warn_threshold_in_ms: 1000
>> back_pressure_enabled: false
>> back_pressure_strategy:
>>     - class_name: org.apache.cassandra.net.RateBasedBackPressure
>>       parameters:
>>           - high_ratio: 0.90
>>             factor: 5
>>             flow: FAST
>>
>> *A lot of maps, more than 200K for the Cassandra process:*
>>
>> [root@cass063 ~]# wc -l /proc/`ps -ef | grep CassandraDaemon | grep -v grep | awk '{print $2}'`/maps
>> 239587 /proc/202664/maps
>>
>> Thanks,
>> Roy
>>
>
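
A possible follow-up to the `wc -l` count quoted above: the total alone does not say which files hold the mappings. The sketch below is not from this thread; it is a minimal Python example, assuming a Linux host where /proc/<pid>/maps and /proc/sys/vm/max_map_count are readable. The function names count_mappings_by_path and summarize are hypothetical, chosen only for illustration.

```python
from collections import Counter


def count_mappings_by_path(maps_text):
    """Count memory mappings per backing path.

    maps_text is the raw content of /proc/<pid>/maps. Each line has the
    fields: address perms offset dev inode [pathname]. Lines without a
    pathname are counted under the label "[anonymous]".
    """
    counts = Counter()
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        # Split into at most 6 fields so paths containing spaces stay whole.
        fields = line.split(None, 5)
        path = fields[5].strip() if len(fields) > 5 else ""
        counts[path or "[anonymous]"] += 1
    return counts


def summarize(pid, top=10):
    """Print total mappings, the kernel limit, and the busiest paths."""
    with open(f"/proc/{pid}/maps") as f:
        counts = count_mappings_by_path(f.read())
    with open("/proc/sys/vm/max_map_count") as f:
        limit = int(f.read())
    total = sum(counts.values())
    print(f"{total} mappings, vm.max_map_count = {limit}")
    for path, n in counts.most_common(top):
        print(f"{n:8d}  {path}")
```

Calling summarize(pid) with the Cassandra process ID prints the total next to vm.max_map_count and the ten paths with the most mappings. If a small number of SSTable files account for most of the entries, that might help narrow down where the mappings come from, though it would not by itself explain the "Map failed" error.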