Attached is a patch for a (rare) problem I found just by accidentally reading
the code. It very likely fixes this:
I haven't had time to look into it yet, but I just wanted to let you know
that I hit this, in case someone is working in that code.
ERROR 14:07:31,215 Fatal exception in thread Thread[COMMIT-LOG-WRITER,5,main]
java.nio.BufferOverflowException
    at java.nio.Buffer.nextPutIndex(Buffer.java:501)
    at java.nio.DirectByteBuffer.putInt(DirectByteBuffer.java:654)
    at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:259)
    at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:568)
    at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.lang.Thread.run(Thread.java:662)
INFO 14:07:31,504 flushing high-traffic column family CFS(Keyspace='***', ColumnFamily='***') (estimated 103394287 bytes)
It happened during a fairly standard load process using M/R.
Afterwards, the server refused to shut down in response to a standard kill.
--
Piotr Kołaczkowski
Instytut Informatyki, Politechnika Warszawska
Nowowiejska 15/19, 00-665 Warszawa
e-mail: pkola...@ii.pw.edu.pl
www: http://home.elka.pw.edu.pl/~pkolaczk/
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index bcd13fd..1351271 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -56,8 +56,8 @@ public class CommitLogSegment
private static final String FILENAME_EXTENSION = ".log";
private static Pattern COMMIT_LOG_FILE_PATTERN = Pattern.compile(FILENAME_PREFIX + "(\\d+)" + FILENAME_EXTENSION);
- // The commit log entry overhead in bytes (int: length + long: head checksum + long: tail checksum)
- static final int ENTRY_OVERHEAD_SIZE = 4 + 8 + 8;
+ // The commit log entry overhead in bytes (int: length + long: head checksum + long: tail checksum + int: end of segment marker)
+ static final int ENTRY_OVERHEAD_SIZE = 4 + 8 + 8 + 4;
// cache which cf is dirty in this segment to avoid having to lookup all ReplayPositions to decide if we can delete this segment
private final HashMap<Integer, Integer> cfLastWrite = new HashMap<Integer, Integer>();
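
For reference, here is a minimal, self-contained sketch of the failure mode as I
understand it; the class name, segment size and dummy checksum values are purely
illustrative and not the real CommitLogSegment layout. If the per-entry overhead
constant omits the 4-byte end-of-segment marker, an entry that exactly fills the
remaining space still passes the capacity check, and the trailing putInt() then
throws BufferOverflowException, matching the stack trace above.

import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Illustrative only: a toy "segment" showing why undercounting the entry
// overhead by the 4-byte end-of-segment marker leads to BufferOverflowException.
public class EntryOverheadSketch
{
    // Overhead as counted before the patch: int length + long head checksum + long tail checksum.
    static final int OLD_ENTRY_OVERHEAD_SIZE = 4 + 8 + 8;
    // Overhead as counted after the patch: the end-of-segment marker is included.
    static final int NEW_ENTRY_OVERHEAD_SIZE = 4 + 8 + 8 + 4;

    public static void main(String[] args)
    {
        int segmentSize = 64; // tiny segment, just for the demo
        ByteBuffer buffer = ByteBuffer.allocate(segmentSize);

        // A payload chosen so the entry exactly fills the segment under the old accounting.
        int payloadSize = segmentSize - OLD_ENTRY_OVERHEAD_SIZE;

        System.out.println("old check says it fits: " + (buffer.remaining() >= payloadSize + OLD_ENTRY_OVERHEAD_SIZE)); // true
        System.out.println("new check says it fits: " + (buffer.remaining() >= payloadSize + NEW_ENTRY_OVERHEAD_SIZE)); // false

        buffer.putInt(payloadSize);        // length
        buffer.putLong(0L);                // head checksum (dummy)
        buffer.put(new byte[payloadSize]); // serialized mutation (dummy payload)
        buffer.putLong(0L);                // tail checksum (dummy)

        try
        {
            // The buffer is now completely full, so the end-of-segment marker overflows it.
            buffer.putInt(0);
        }
        catch (BufferOverflowException e)
        {
            System.out.println("overflow on the end-of-segment marker: " + e);
        }
    }
}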