Cut your memtable thresholds (throughput and ops) in half. See "describe keyspaces" and "update keyspace" in the cli.
On Wed, Dec 8, 2010 at 9:18 AM, Amin Sakka, Novapost <[email protected] > wrote: > Thanks for your answer Aaron, > > I'm now on the RC1, I have no longer the ActiveCount error, however my > nodes still dying under bulk insertion. > > I have modified my nodes configuration (all of them has now 2GB Heap size). > The nodes still under heavy pressure and they dies after a random timeout > (sometimes after 10 minutes of insertion > and sometimes after 50 minutes). > I want to point that I'm inserting rows in 4 different columns families at > the same time and that the rows size is too little (few KiloBytes). > I've attached here my cassandra.yaml configuration file. > Can you help me please to solve this issue? > > Thanks! > > Here is some of my log output: > > DEBUG [MutationStage:27] 2010-12-08 15:31:19,214 > RowMutationVerbHandler.java (line 78) RowMutation(keyspace='SAE', > key='6163636f756e743936353a726566393230', > modifications=[ColumnFamily(Document > [6465736372697074696f6e:false:4...@1291818633293000 > ,646f63756d656e744964:false:3...@1291818633293000 > ,7265666572656e6365:false:1...@1291818633293000,])]) applied. Sending > response to 714546@/10.0.100.94 > DEBUG [MutationStage:7] 2010-12-08 15:31:19,214 RowMutationVerbHandler.java > (line 54) Applying RowMutation(keyspace='SAE', > key='6163636f756e743938373a726566313835', > modifications=[ColumnFamily(Document > [6465736372697074696f6e:false:4...@1291818633524000 > ,646f63756d656e744964:false:3...@1291818633524000 > ,7265666572656e6365:false:1...@1291818633524000,])]) > DEBUG [MutationStage:7] 2010-12-08 15:31:19,214 Table.java (line 378) > applying mutation of row 6163636f756e743938373a726566313835 > DEBUG [MutationStage:7] 2010-12-08 15:31:19,214 RowMutationVerbHandler.java > (line 78) RowMutation(keyspace='SAE', > key='6163636f756e743938373a726566313835', > modifications=[ColumnFamily(Document > [6465736372697074696f6e:false:4...@1291818633524000 > ,646f63756d656e744964:false:3...@1291818633524000 > ,7265666572656e6365:false:1...@1291818633524000,])]) applied. Sending > response to 714547@/10.0.100.94 > *DEBUG [ScheduledTasks:1] 2010-12-08 15:31:19,373 GCInspector.java (line > 135) GC for ParNew: 13 ms, 34456296 reclaimed leaving 1465917704 used; max > is 2256404480* > > > > > > 2010/12/1 Aaron Morton <[email protected]> > > Running nodes with different JVM heap sizes would not be recommended >> practice, for many reasons. Nor would I recommend running them with all the >> memory the machine has, it will just lead to the OS swapping the JVM out to >> disk and considerable slow things down. >> >> I would suggest a heap size of 1.5 or 2.0 GB for each node, and have a >> read of the JVM Heap Size section here >> http://wiki.apache.org/cassandra/MemtableThresholds . AFAIK the logs are >> showing your cluster was under heavy GC pressure. >> >> Finally, the ActiveCount error message was a known issue in beta 2. Treat >> yourself and try RC1 :) >> http://www.mail-archive.com/[email protected]/msg06298.html >> >> Aaron >> >> >> >> On 02 Dec, 2010,at 12:33 AM, asakka <[email protected]> wrote: >> >> >> Hello, >> >> I'm making some tests on a data model with 3 CF and 1 SCF, I want to start >> by inserting 1 million rows (my target is to have 1billion rows) . >> I have three nodes cluster (I'm using the same machines with 3GB of RAM >> each , intel core2 duo 1,6GHZ), RF = 2, CL = 1, HEAPSIZE of the seed = 3GO >> (it was 1.5GO, I've doubled it to avoid the heap size exception I had) , >> the >> other two nodes are 1.5GO. >> >> I am using cassandra (V0.7.0-beta2) and Hector (V0.7.0.18) . I'm making >> insertion in batch mode using hector Mutator. >> My disk_access_mode is standard. >> I reduced also my memtable_throughput_in_mb to 64, but the problem >> persists >> and I have the following exception : >> I want to know if it is a configuration or hardware problem ? >> >> INFO [Timer-0] 2010-12-01 10:34:42,124 Gossiper.java (line 196) >> InetAddress >> /10.0.100.215 is now dead. >> INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,188 Gossiper.java (line 594) >> Node >> /10.0.100.215 has restarted, now UP again >> INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,189 StorageService.java (line >> 643) Node /10.0.100.215 state jump to normal >> INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,189 StorageService.java (line >> 650) Will not change my token ownership to /10.0.100.215 >> INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:34:44,189 >> HintedHandOffManager.java (line 196) Started hinted handoff for endpoint >> /10.0.100.215 >> INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:34:44,189 >> HintedHandOffManager.java (line 252) Finished hinted handoff of 0 rows to >> endpoint /10.0.100.215 >> INFO [GC inspection] 2010-12-01 10:40:29,141 GCInspector.java (line 129) >> GC >> for ParNew: 750 ms, 14693208 reclaimed leaving 2140055192 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:30,280 GCInspector.java (line 129) >> GC >> for ParNew: 445 ms, 17042288 reclaimed leaving 2178211008 used; max is >> 3355312128 >> INFO [WRITE-/10.0.100.214] 2010-12-01 10:40:31,552 >> OutboundTcpConnection.java (line 115) error writing to /10.0.100.214 >> INFO [GC inspection] 2010-12-01 10:40:32,280 GCInspector.java (line 129) >> GC >> for ParNew: 211 ms, 25550568 reclaimed leaving 2235227312 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:34,320 GCInspector.java (line 129) >> GC >> for ParNew: 290 ms, 26512896 reclaimed leaving 2277013184 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:35,950 GCInspectorjava (line 129) GC >> >> for ParNew: 506 ms, 24319976 reclaimed leaving 2303739672 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:37,202 GCInspector.java (line 129) >> GC >> for ParNew: 462 ms, 31759008 reclaimed leaving 2306914712 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:42,629 GCInspector.java (line 129) >> GC >> for ParNew: 445 ms, 14769312 reclaimed leaving 2327064920 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:43,969 GCInspector.java (line 129) >> GC >> for ParNew: 720 ms, 14804208 reclaimed leaving 2366434112 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:45,372 GCInspector.java (line 129) >> GC >> for ParNew: 325 ms, 23112128 reclaimed leaving 2421032952 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:40:47,843 GCInspector.java (line 129) >> GC >> for ParNew: 801 ms, 26014296 reclaimed leaving 2474278880 used; max is >> 3355312128 >> INFO [Timer-0] 2010-12-01 10:41:18,451 Gossiper.java (line 196) >> InetAddress >> /10.0.100.215 is now dead. >> INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:41:19,362 >> HintedHandOffManager.java (line 196) Started hinted handoff for endpoint >> /10.0.100.215 >> INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:41:19,975 >> HintedHandOffManager.java (line 252) Finished hinted handoff of 0 rows to >> endpoint /10.0.100.215 >> INFO [GOSSIP_STAGE:1] 2010-12-01 10:41:19,506 Gossiper.java (line 580) >> InetAddress /10.0.100.215 is now UP >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:28,873 SSTablejava (line >> >> 145) Deleted /var/lib/cassandra/data/SAE4/Document-e-20-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:28,952 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-148-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,053 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-7-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,163 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-12-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,274 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Document-e-13-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,407 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-13-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,513 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-17-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,545 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Document-e-7-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,577 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-146-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,776 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Document-e-2-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,784 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-145-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,882 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Document-e-19-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,883 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-147-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,884 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-14-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,886 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-5-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,887 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-7-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,887 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-8-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,888 SSTablejava (line >> >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-16-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,917 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-3-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,918 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-6-<> >> INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,918 SSTable.java (line >> 145) Deleted /var/lib/cassandra/data/SAE4/Account-e-15-<> >> INFO [GC inspection] 2010-12-01 10:51:07,737 GCInspector.java (line 129) >> GC >> for ParNew: 496 ms, 36105328 reclaimed leaving 274154728 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:51:43,742 GCInspector.java (line 129) >> GC >> for ParNew: 3386 ms, 12099384 reclaimed leaving 297145056 used; max is >> 3355312128 >> INFO [GC inspection] 2010-12-01 10:51:45,487 GCInspector.java (line 150) >> Pool Name Active Pending >> INFO [GC inspection] 2010-12-01 10:51:45,706 GCInspector.java (line 156) >> MIGRATION_STAGE 0 0 >> INFO [GC inspection] 2010-12-01 10:51:45,716 GCInspector.java (line 156) >> GOSSIP_STAGE 0 0 >> ERROR [GC inspection] 2010-12-01 10:51:46,313 AbstractCassandraDaemon.java >> (line 88) Fatal exception in thread Thread[GC inspection,5,main] >> java.lang.reflect.UndeclaredThrowableException >> at $Proxy1.getActiveCount(Unknown Source) >> at >> >> org.apache.cassandra.service.GCInspector.logThreadPoolStats(GCInspector.java:156) >> at >> >> org.apache.cassandra.service.GCInspector.logIntervalGCStats(GCInspector.java:136) >> at >> org.apache.cassandra.service.GCInspector.access$000(GCInspector.java:39) >> at org.apache.cassandra.service.GCInspector$1.run(GCInspector.java:93) >> at java.util.TimerThread.mainLoop(Timer.java:512) >> at java.util.TimerThread.run(Timer.java:462) >> Caused by: javax.management.AttributeNotFoundException: No such attribute: >> ActiveCount >> at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:63) >> at >> com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216) >> at >> >> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666) >> at >> >> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638) >> at >> >> javax.managementMBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:263) >> >> ... 7 more >> >> -- >> View this message in context: >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/GC-Exceptions-and-cluster-nodes-are-dying-tp5791496p5791496.html >> Sent from the [email protected] mailing list archive at >> Nabble.com. >> >> > > > -- > > Amin SAKKA > Research and Development Engineer > 32 rue de Paradis, 75010 Paris > *Tel:* +33 (0)6 34 14 19 25 > *Mail:* [email protected] > *Web:* www.novapost.fr / www.novapost-rh.fr > > > > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
