I am running an 8-node Cassandra cluster, with each node on its own dedicated VM.

My app very quickly populates the database with about 100,000 rows of data
(each row is about 100K bytes) times the number of nodes in my cluster, so
there are about 100,000 rows of data on each node (the distribution seems very even).
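
For scale, the figures above can be turned into a rough data volume. This is only a minimal sketch of the arithmetic: it assumes "100K bytes" means 100,000 bytes per row, and the constant and function names are illustrative, not anything from Cassandra.

```python
# Back-of-the-envelope data volume for the load described above.
# Assumption: "100K bytes" is taken as 100,000 bytes per row.
ROW_BYTES = 100_000
ROWS_PER_NODE = 100_000
NODES = 8


def total_rows(nodes: int, rows_per_node: int) -> int:
    """Rows written across the whole cluster."""
    return nodes * rows_per_node


def raw_bytes_per_node(rows_per_node: int, row_bytes: int) -> int:
    """Raw payload per node, assuming rows are spread evenly."""
    return rows_per_node * row_bytes


if __name__ == "__main__":
    print("total rows:", total_rows(NODES, ROWS_PER_NODE))
    print("raw bytes per node:", raw_bytes_per_node(ROWS_PER_NODE, ROW_BYTES))
```

Under those assumptions this comes to 800,000 rows in total and roughly 10 GB of raw payload per node, before any on-disk overhead.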

I have been running my app fairly successfully, but today I changed the replication
factor from 1 to 3. (I first took down the servers, nuked their data
directories, copied the new storage-conf.xml to each node, then restarted
the servers.)  My app begins by populating the database with fresh data.  During
the writing phase, all the Cassandra servers, one by one, started throwing
out-of-memory errors.  Here's the output from the first to die:

INFO [COMMIT-LOG-WRITER] 2010-06-10 14:18:54,609 CommitLog.java (line 407) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1276193883235.log)

INFO [ROW-MUTATION-STAGE:5] 2010-06-10 14:18:55,499 ColumnFamilyStore.java (line 609) Enqueuing flush of Memtable(Standard1)@19571399

INFO [GMFD:1] 2010-06-10 14:19:01,556 Gossiper.java (line 568) InetAddress /10.210.69.221 is now UP
INFO [GMFD:1] 2010-06-10 14:20:35,136 Gossiper.java (line 568) InetAddress /10.254.242.228 is now UP
INFO [GMFD:1] 2010-06-10 14:20:35,137 Gossiper.java (line 568) InetAddress /10.201.207.129 is now UP
INFO [GMFD:1] 2010-06-10 14:20:36,922 Gossiper.java (line 568) InetAddress /10.198.37.241 is now UP

INFO [GC inspection] 2010-06-10 14:19:03,722 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 2164 ms, 8754168 reclaimed leaving 1070909048 used; max is 1174339584
INFO [GC inspection] 2010-06-10 14:21:09,068 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 2151 ms, 78896080 reclaimed leaving 994679752 used; max is 1174339584
INFO [Timer-1] 2010-06-10 14:21:09,068 Gossiper.java (line 179) InetAddress /10.198.37.241 is now dead.
INFO [Timer-1] 2010-06-10 14:21:12,045 Gossiper.java (line 179) InetAddress /10.210.69.221 is now dead.
INFO [GMFD:1] 2010-06-10 14:21:12,046 Gossiper.java (line 568) InetAddress /10.210.203.210 is now UP
INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) InetAddress /10.210.69.221 is now UP
INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) InetAddress /10.192.218.117 is now UP
INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) InetAddress /10.198.37.241 is now UP
INFO [GMFD:1] 2010-06-10 14:21:12,307 Gossiper.java (line 568) InetAddress /10.254.138.226 is now UP
ERROR [ROW-MUTATION-STAGE:25] 2010-06-10 14:21:15,127 CassandraDaemon.java (line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:25,5,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:84)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:29)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:117)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:108)
        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:359)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:369)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:322)
        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:45)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
ERROR [ROW-MUTATION-STAGE:18] 2010-06-10 14:21:15,129 CassandraDaemon.java (line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:18,5,main]
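
The two "GC inspection" lines in the log above already show how full the heap was shortly before the error. As a minimal sketch (the regular expression and the `GcSample` / `parse_gc_samples` names are my own, written against the message format as it appears in this log, not a Cassandra API), they can be pulled out and turned into a used-heap fraction like this:

```python
import re
from typing import List, NamedTuple

# Pattern written against the GCInspector message text as it appears in this log.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms, "
    r"(?P<reclaimed>\d+) reclaimed leaving (?P<used>\d+) used; "
    r"max is (?P<max>\d+)"
)


class GcSample(NamedTuple):
    collector: str
    pause_ms: int
    reclaimed: int
    used: int
    max_heap: int

    @property
    def used_fraction(self) -> float:
        """Heap still in use after the collection, as a fraction of max."""
        return self.used / self.max_heap


def parse_gc_samples(log_text: str) -> List[GcSample]:
    # Collapse whitespace first so hard-wrapped log lines still match.
    flat = " ".join(log_text.split())
    return [
        GcSample(m["collector"], int(m["ms"]), int(m["reclaimed"]),
                 int(m["used"]), int(m["max"]))
        for m in GC_LINE.finditer(flat)
    ]


# The two GC lines from the log above, wrapped as they were in the original post.
SAMPLE = """
INFO [GC inspection] 2010-06-10 14:19:03,722 GCInspector.java (line 110)
GC for ConcurrentMarkSweep: 2164 ms, 8754168 reclaimed leaving 1070909048 used;
max is 1174339584
INFO [GC inspection] 2010-06-10 14:21:09,068 GCInspector.java (line 110) GC for
ConcurrentMarkSweep: 2151 ms, 78896080 reclaimed leaving 994679752 used; max is
1174339584
"""

if __name__ == "__main__":
    for s in parse_gc_samples(SAMPLE):
        print(f"{s.collector}: {s.pause_ms} ms pause, "
              f"{s.used_fraction:.1%} of heap still used")
```

For these two lines the heap is still about 91% and 85% full after a roughly two-second ConcurrentMarkSweep collection.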



Within 15 minutes, all 8 nodes died while my app continued trying to populate
the database.  Is there something I am doing wrong?  I am populating the
database very quickly: each of 8 clients writes 100 rows at a time until
it has written 100,000 rows.  All of my Cassandra servers are started
with a maximum heap of 1GB:  /usr/bin/java -ea -Xms128M -Xmx1G …
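
To put the write pattern next to the heap size, here is a rough estimate of how much raw row payload is in flight per node when every client has one 100-row batch outstanding. It is a sketch under stated assumptions, not a diagnosis: it takes "100K bytes" as 100,000 bytes, assumes replicas are spread evenly over the 8 nodes, and counts only payload (no memtables, serialization overhead, or queued mutations). All names are illustrative.

```python
# Rough in-flight write payload per node for one round of client batches.
# Assumptions: 100,000-byte rows, even replica placement, payload only.
ROW_BYTES = 100_000
CLIENTS = 8
ROWS_PER_BATCH = 100
NODES = 8
HEAP_BYTES = 1024 ** 3  # -Xmx1G


def in_flight_bytes_per_node(clients: int, rows_per_batch: int, row_bytes: int,
                             replication_factor: int, nodes: int) -> int:
    """Average payload landing on each node for one batch per client."""
    cluster_bytes = clients * rows_per_batch * row_bytes * replication_factor
    return cluster_bytes // nodes


def heap_fraction(payload_bytes: int, heap_bytes: int = HEAP_BYTES) -> float:
    """Payload as a fraction of the configured maximum heap."""
    return payload_bytes / heap_bytes


if __name__ == "__main__":
    for rf in (1, 3):
        b = in_flight_bytes_per_node(CLIENTS, ROWS_PER_BATCH, ROW_BYTES, rf, NODES)
        print(f"RF={rf}: {b:,} bytes per node per round "
              f"({heap_fraction(b):.1%} of a 1GB heap)")
```

With these numbers, going from a replication factor of 1 to 3 triples the per-node payload of each round from about 10 MB to about 30 MB.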

Thank you for your help!
Julie
