Hi Bhuvan, how big are the commit logs on the failed node, and what are your MAX_HEAP_SIZE and HEAP_NEWSIZE settings?
Also, what are the values of the following properties in cassandra.yaml?

memtable_allocation_type
memtable_cleanup_threshold
memtable_flush_writers
memtable_heap_space_in_mb
memtable_offheap_space_in_mb

Regards,
Mike Yeap

On Sun, May 29, 2016 at 6:18 PM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:

> Hi,
>
> We are running a 6-node cluster in 2 DCs on DSC 3.0.3, with 3 nodes each.
> One of the nodes was showing UNREACHABLE on the other nodes in nodetool
> describecluster, and on that node all the others were showing UNREACHABLE,
> so as a measure we restarted the node.
>
> But on doing that it appears to be stuck, with these messages in
> system.log:
>
> DEBUG [SlabPoolCleaner] 2016-05-29 14:07:28,156 ColumnFamilyStore.java:829 - Enqueuing flush of batches: 226784704 (11%) on-heap, 0 (0%) off-heap
> DEBUG [main] 2016-05-29 14:07:28,576 CommitLogReplayer.java:415 - Replaying /commitlog/data/CommitLog-6-1464508993391.log (CL version 6, messaging version 10, compression null)
> DEBUG [main] 2016-05-29 14:07:28,781 ColumnFamilyStore.java:829 - Enqueuing flush of batches: 207333510 (10%) on-heap, 0 (0%) off-heap
>
> The MemtablePostFlush / MemtableFlushWriter stages are where it is stuck,
> with pending messages. This has been their status as per *nodetool tpstats*
> for a long time:
>
> MemtablePostFlush     Active - 1    pending - 52    completed - 16
> MemtableFlushWriter   Active - 2    pending - 13    completed - 15
>
> We restarted the node with the log level set to TRACE, but in vain. What
> could be a possible contingency plan in such a scenario?
>
> Best Regards,
> Bhuvan