I had 3 nodes with strategy_options (DC1=3) in one DC. I added one more DC and 3 more nodes. I didn't set the initial tokens; instead I ran nodetool move on the new nodes (adding 1 to the tokens of the corresponding nodes in DC1). I then updated the keyspace to strategy_options (DC1=3, DC2=3) and started running nodetool repair on each of the nodes.

Before I started the repair, each node had around 5 GB of data. I began with the new nodes. Two of them completed the repair in about 2 hours each. During the repair I saw the data grow to almost 25 GB, but once the repair was done it settled at around 9 GB. Is this normal?

The 3rd node has been running repair for a long time. It eventually stopped, throwing this exception:
Exception in thread "main" java.rmi.UnmarshalException: Error unmarshaling return header; nested exception is:
        java.io.EOFException
        at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:209)
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:142)
        at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
        at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
        at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:993)
        at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:288)
        at $Proxy0.forceTableRepair(Unknown Source)
        at org.apache.cassandra.tools.NodeProbe.forceTableRepair(NodeProbe.java:192)
        at org.apache.cassandra.tools.NodeCmd.optionalKSandCFs(NodeCmd.java:773)
        at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:669)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:195)

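For what it's worth, the EOFException above appears to be only nodetool's JMX/RMI connection to the node dropping, so the repair may still have been running server-side. This is how I've been checking (the host name here is a placeholder, not my actual node name):

```shell
# Check whether the node is still doing repair work after the
# nodetool client itself died. "new-node-3" is a placeholder host name.
nodetool -h new-node-3 compactionstats   # pending/active validation compactions
nodetool -h new-node-3 netstats          # repair streams in progress
```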
I started the repair again, since it's safe to do so. Now the GCInspector complains of not enough heap:

 WARN [ScheduledTasks:1] 2011-10-01 13:08:16,227 GCInspector.java (line 149) Heap is 0.7598414264960864 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2011-10-01 13:08:16,227 StorageService.java (line 2398) Unable to reduce heap usage since there are no dirty column families

nodetool ring shows 48 GB of data on the node.

My Xmx is 2G. I rely on OS caching more than on row caching or key caching, so the column families are created with default settings.
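For reference, this is roughly how the heap is configured (the flush threshold is the stock default, which matches the ~0.76 trigger in the warning above):

```shell
# conf/cassandra-env.sh -- 2 GB heap, everything else left at defaults
MAX_HEAP_SIZE="2G"

# conf/cassandra.yaml -- default threshold that triggered the emergency flush:
# flush_largest_memtables_at: 0.75
```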

Any help would be appreciated.

Thanks
-Raj
