Have you checked the effective limits of the running CS process? Is CS running as the cassandra user? Just to rule out missing file permissions.
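To illustrate what I mean by "effective limits": the values in limits.conf only apply if they were actually picked up when the process started, so it is worth reading them from the live process. Below is a minimal sketch, assuming Linux with /proc mounted. The function names parse_limits and effective_limits are made up for this example and are not part of Cassandra or any library.

```python
import re


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    # The first line is the column header, so skip it.
    for line in text.splitlines()[1:]:
        # Columns are padded with runs of spaces; limit names such as
        # "Max open files" contain single spaces, so split on 2+ spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def effective_limits(pid):
    """Read the limits the kernel is enforcing for a running process."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_limits(fh.read())
```

For the java process in the ps output quoted below (PID 2267), effective_limits(2267)["Max open files"] would return the soft and hard nofile values that process is really running with. Reading another user's /proc entry may require root.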

On 06.04.2017 at 12:24, "Cogumelos Maravilha" <cogumelosmaravi...@sapo.pt> wrote:

From cassandra.yaml:

hints_directory: /mnt/cassandra/hints
data_file_directories:
    - /mnt/cassandra/data
commitlog_directory: /mnt/cassandra/commitlog
saved_caches_directory: /mnt/cassandra/saved_caches

drwxr-xr-x 3 cassandra cassandra 23 Apr 5 16:03 mnt/

drwxr-xr-x 6 cassandra cassandra 68 Apr 5 16:17 ./
drwxr-xr-x 3 cassandra cassandra 23 Apr 5 16:03 ../
drwxr-xr-x 2 cassandra cassandra 80 Apr 6 10:07 commitlog/
drwxr-xr-x 8 cassandra cassandra 124 Apr 5 16:17 data/
drwxr-xr-x 2 cassandra cassandra 72 Apr 5 16:20 hints/
drwxr-xr-x 2 cassandra cassandra 49 Apr 5 20:17 saved_caches/

cassand+ 2267 1 99 10:18 ? 00:02:56 java -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...

/dev/mapper/um_vg-xfs_lv 885G 27G 858G 4% /mnt

In /etc/security/limits.conf:

* - memlock unlimited
* - nofile 100000
* - nproc 32768
* - as unlimited

In /etc/security/limits.d/cassandra.conf:

cassandra - memlock unlimited
cassandra - nofile 100000
cassandra - as unlimited
cassandra - nproc 32768

In /etc/sysctl.conf:

vm.max_map_count = 1048575

In /etc/systcl.d/cassanda.conf:

vm.max_map_count = 1048575
net.ipv4.tcp_keepalive_time=600

In /etc/pam.d/su:

...
session required pam_limits.so
...

Distro is the current Ubuntu LTS.

Thanks

On 04/06/2017 10:39 AM, benjamin roth wrote:

Cassandra cannot write an SSTable to disk. Are you sure the disk/volume where SSTables reside (normally /var/lib/cassandra/data) is writable for the CS user and has enough free space?
The CDC warning also implies that.
The other warnings indicate you are probably not running CS as root and you did not set an appropriate limit for max open files. Running out of open files can also be a reason for the IO error.

2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha <cogumelosmaravi...@sapo.pt>:

> Hi list,
>
> I'm using C* 3.10 in a 6-node cluster, RF=2.
> All instances are type i3.xlarge (AWS) with 32GB, 2 cores and an 885G SSD
> (LVM, XFS-formatted). I have one node that keeps dying and I don't
> understand why. Can anyone give me some hints, please? All nodes use the
> same configuration.
>
> Thanks in advance.
>
> INFO [IndexSummaryManager:1] 2017-04-06 05:22:18,352 IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:22,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output error
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086) ~[apache-cassandra-3.10.jar:3.10]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.10.jar:3.10]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> Caused by: java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.force0(Native Method) ~[na:1.8.0_121]
>     at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76) ~[na:1.8.0_121]
>     at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388) ~[na:1.8.0_121]
>     at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:169) ~[apache-cassandra-3.10.jar:3.10]
>     ... 15 common frames omitted
> INFO [IndexSummaryManager:1] 2017-04-06 06:22:18,366 IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:31,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output error
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086) ~[apache-cassandra-3.10.jar:3.10]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.10.jar:3.10]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> Caused by: java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.force0(Native Method) ~[na:1.8.0_121]
>     at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76) ~[na:1.8.0_121]
>     at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388) ~[na:1.8.0_121]
>     at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158) ~[apache-cassandra-3.10.jar:3.10]
>     at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:169) ~[apache-cassandra-3.10.jar:3.10]
>     ... 15 common frames omitted
> INFO [main] 2017-04-06 07:11:57,289 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
>
>
> Some ERROR messages:
>
> ERROR [MemtablePostFlush:2] 2017-04-05 23:35:46,339 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:2,5,main]
> ERROR [MemtablePostFlush:3] 2017-04-05 23:44:08,471 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:3,5,main]
> ERROR [MemtablePostFlush:4] 2017-04-05 23:54:41,224 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:4,5,main]
> ERROR [MessagingService-Incoming-/10.0.120.52] 2017-04-06 03:19:13,453 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/10.0.120.52,5,main]
> ERROR [epollEventLoopGroup-2-6] 2017-04-06 03:24:41,006 CassandraDaemon.java:229 - Exception in thread Thread[epollEventLoopGroup-2-6,10,main]
> ERROR [Native-Transport-Requests-36] 2017-04-06 03:25:45,915 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-49] 2017-04-06 03:25:45,915 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [IndexSummaryManager:1] 2017-04-06 03:25:45,915 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-69] 2017-04-06 03:25:45,916 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-46] 2017-04-06 03:26:18,465 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [SharedPool-Worker-136] 2017-04-06 03:26:18,465 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-156] 2017-04-06 03:26:18,465 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [SharedPool-Worker-92] 2017-04-06 03:26:24,696 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-48] 2017-04-06 03:26:24,696 ?:? - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-66] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-77] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [GossipTasks:1] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-133] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-135] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [ScheduledFastTasks:1] 2017-04-06 03:26:55,808 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-70] 2017-04-06 03:27:11,569 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [IndexSummaryManager:1] 2017-04-06 03:27:17,821 CassandraDaemon.java:229 - Exception in thread Thread[IndexSummaryManager:1,1,main]
> ERROR [Native-Transport-Requests-103] 2017-04-06 03:27:24,049 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-69] 2017-04-06 03:27:24,049 SEPWorker.java:145 - Failed to execute task, unexpected exception killed worker: {}
> ERROR [SharedPool-Worker-98] 2017-04-06 03:27:24,049 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [MessagingService-Incoming-/10.0.120.52] 2017-04-06 03:27:55,079 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [epollEventLoopGroup-2-5] 2017-04-06 03:27:55,079 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
> ERROR [Native-Transport-Requests-64] 2017-04-06 03:28:43,285 SEPWorker.java:145 - Failed to execute task, unexpected exception killed worker: {}
> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:22,5,main]
> ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525 CassandraDaemon.java:229 - Exception in thread Thread[MemtablePostFlush:31,5,main]
>
> Also some WARNs:
>
> WARN [main] 2017-04-06 09:26:49,725 CLibrary.java:178 - Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root.
>
> WARN [main] 2017-04-06 09:25:07,355 StartupChecks.java:157 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
>
> WARN [main] 2017-04-06 09:25:07,369 SigarLibrary.java:174 - Cassandra server running in degraded mode. Is swap disabled? : true, Address space adequate? : true, nofile limit adequate? : false, nproc limit adequate? : true
>
> WARN [main] 2017-04-06 09:25:07,091 DatabaseDescriptor.java:493 - Small cdc volume detected at /var/lib/cassandra/cdc_raw; setting cdc_total_space_in_mb to 2502. You can override this in cassandra.yaml