Hi Paulo,

Which metric should I watch for this?
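In the meantime I plan to track the bloom filter footprint directly, since
tpstats does not seem to break that out. A minimal sketch, assuming this
nodetool build accepts tablestats (older builds call it cfstats) and that
nodetool info reports heap and off-heap lines:

    # Per-table bloom filter footprint (space used / off-heap memory used),
    # keeping the keyspace/table headers for context
    nodetool -u cassandra -pw '########' tablestats | grep -i -E 'keyspace:|table:|bloom filter'

    # Node-wide heap vs. off-heap usage, to see which side is growing
    nodetool -u cassandra -pw '########' info | grep -i 'heap'

Here are the version details and the current tpstats output: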
[root@avesterra-prod-1 ~]# rpm -qa | grep datastax
datastax-ddc-3.2.1-1.noarch
datastax-ddc-tools-3.2.1-1.noarch
[root@avesterra-prod-1 ~]# cassandra -v
3.2.1
[root@avesterra-prod-1 ~]# nodetool -u cassandra -pw '########' tpstats
Pool Name                        Active  Pending  Completed  Blocked  All time blocked
MutationStage                         0        0      13609        0                 0
ViewMutationStage                     0        0          0        0                 0
ReadStage                             0        0          0        0                 0
RequestResponseStage                  0        0          8        0                 0
ReadRepairStage                       0        0          0        0                 0
CounterMutationStage                  0        0          0        0                 0
MiscStage                             0        0          0        0                 0
CompactionExecutor                    1        1      17556        0                 0
MemtableReclaimMemory                 0        0         38        0                 0
PendingRangeCalculator                0        0          8        0                 0
GossipStage                           0        0     118094        0                 0
SecondaryIndexManagement              0        0          0        0                 0
HintsDispatcher                       0        0          0        0                 0
MigrationStage                        0        0          0        0                 0
MemtablePostFlush                     0        0         55        0                 0
PerDiskMemtableFlushWriter_0          0        0         38        0                 0
ValidationExecutor                    0        0          0        0                 0
Sampler                               0        0          0        0                 0
MemtableFlushWriter                   0        0         38        0                 0
InternalResponseStage                 0        0          0        0                 0
AntiEntropyStage                      0        0          0        0                 0
CacheCleanupExecutor                  0        0          0        0                 0
Native-Transport-Requests             0        0          0        0                 0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
HINT                         0
MUTATION                     0
COUNTER_MUTATION             0
BATCH_STORE                  0
BATCH_REMOVE                 0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0
[root@avesterra-prod-1 ~]#

Thanks a lot,
Mohamed.

On Mon, Mar 14, 2016 at 8:22 AM, Paulo Motta <pauloricard...@gmail.com> wrote:

> Can you check with nodetool tpstats whether bloom filter memory
> utilization is very large or ramping up before the node gets killed? You
> could be hitting CASSANDRA-11344.
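To catch whether it ramps up right before the kill, I will also log the
JVM's resident set size once a minute; RSS climbing while the heap stays
flat would point at off-heap growth (bloom filters, jemalloc arenas). A
rough sketch, assuming the usual CassandraDaemon main class and GNU awk on
RHEL 7:

    # Find the Cassandra JVM by its main class
    pid=$(pgrep -f CassandraDaemon)

    # Log the process RSS (ps reports it in KB) once a minute
    while sleep 60; do
        ps -o rss= -p "$pid" |
            awk '{ printf "%s RSS: %.1f GB\n", strftime("%F %T"), $1/1048576 }'
    done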
>
> 2016-03-12 19:43 GMT-03:00 Mohamed Lrhazi <mohamed.lrh...@georgetown.edu>:
>
>> In my case, all nodes seem to be constantly logging messages like these:
>>
>> DEBUG [GossipStage:1] 2016-03-12 17:41:19,123 FailureDetector.java:456 -
>> Ignoring interval time of 2000928319 for /10.212.18.170
>>
>> What does that mean?
>>
>> Thanks a lot,
>> Mohamed.
>>
>> On Sat, Mar 12, 2016 at 5:39 PM, Mohamed Lrhazi <
>> mohamed.lrh...@georgetown.edu> wrote:
>>
>>> Oh wow, similar behavior with a different version altogether!
>>>
>>> On Sat, Mar 12, 2016 at 5:28 PM, ssiv...@gmail.com <ssiv...@gmail.com>
>>> wrote:
>>>
>>>> Hi, I'll duplicate here my email with the same issue:
>>>>
>>>> "I have 7 nodes of C* v2.2.5 running on CentOS 7, using jemalloc for
>>>> dynamic storage allocation, with a single keyspace and a single table
>>>> using the Leveled compaction strategy. I loaded ~500 GB of data into
>>>> the cluster with a replication factor of 3 and waited for compaction
>>>> to finish. But during compaction, each of the C* nodes allocates all
>>>> the available memory (~128 GB) and its process just stops. Is this a
>>>> known bug?"
>>>>
>>>> On 03/13/2016 12:56 AM, Mohamed Lrhazi wrote:
>>>>
>>>> Hello,
>>>>
>>>> We installed the DataStax Community Edition on 8 nodes running RHEL 7
>>>> and inserted some 7 billion rows into a pretty simple table. The
>>>> inserts seem to have completed without issues, but ever since, the
>>>> nodes reliably run out of RAM after a few hours, without any user
>>>> activity at all; no reads or writes are sent. What should we look for
>>>> to try to identify the root cause?
>>>>
>>>> [root@avesterra-prod-1 ~]# cat /etc/redhat-release
>>>> Red Hat Enterprise Linux Server release 7.2 (Maipo)
>>>> [root@avesterra-prod-1 ~]# rpm -qa | grep datastax
>>>> datastax-ddc-3.2.1-1.noarch
>>>> datastax-ddc-tools-3.2.1-1.noarch
>>>> [root@avesterra-prod-1 ~]#
>>>>
>>>> The nodes had 8 GB of RAM, which we doubled twice, and we are now
>>>> trying with 40 GB... they still manage to consume it all and cause the
>>>> oom_killer to kick in.
>>>>
>>>> Pretty much all the settings are the defaults the installation
>>>> created.
>>>>
>>>> Thanks,
>>>> Mohamed.
>>>>
>>>> --
>>>> Thanks,
>>>> Serj