Hi,

Our cluster runs Cassandra 0.7.4 (upgraded from 0.7.3) with a replication factor of 3. On one node, hinted handoff fails with the error shown in log #1 below. When I ran "scrub system HintsColumnFamily", I saw an error in the log (log #2 below). Are these errors critical?

I also tried "repair system HintsColumnFamily", but it refuses to run and reports "No neighbors". I can understand that, because hints are not replicated. But is there any way to fix this without data loss?
INFO [manual-repair-0996a2ec-26d3-4243-9586-d56daf30f9bd] 2011-03-27 13:55:05,664 AntiEntropyService.java (line 752) No neighbors to repair with: manual-repair-0996a2ec-26d3-4243-9586-d56daf30f9bd completed.

Best regards,
Shotaro

---------------- Log #1: Error on hinted handoff ------------------------------------------------
ERROR [HintedHandoff:1] 2011-03-26 20:04:22,528 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.lang.RuntimeException: error reading 4976040 of 4976067
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: error reading 4976040 of 4976067
        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:83)
        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:40)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
        at org.apache.commons.collections.iterators.CollatingIterator.anyHasNext(CollatingIterator.java:364)
        at org.apache.commons.collections.iterators.CollatingIterator.hasNext(CollatingIterator.java:217)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:63)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116)
        at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1368)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:321)
        at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:88)
        at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:409)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more
Caused by: java.io.EOFException
        at java.io.RandomAccessFile.readByte(RandomAccessFile.java:591)
        at org.apache.cassandra.utils.ByteBufferUtil.readShortLength(ByteBufferUtil.java:324)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:335)
        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:351)
        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:311)
        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79)
        ... 21 more
------------------------------------------

---- Log #2: Error on scrub -------------------
INFO [CompactionExecutor:1] 2011-03-27 08:07:34,527 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-530-Data.db')
WARN [CompactionExecutor:1] 2011-03-27 08:07:34,602 CompactionManager.java (line 607) Non-fatal error reading row (stacktrace follows)
java.io.IOError: java.io.IOException: Impossible row size 406136901
        at org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:589)
        at org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
        at org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Impossible row size 406136901
        ... 8 more
INFO [CompactionExecutor:1] 2011-03-27 08:07:34,602 CompactionManager.java (line 613) Retrying from row index; data is 406134739 bytes starting at 21
INFO [CompactionExecutor:1] 2011-03-27 08:08:10,065 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-530-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
INFO [CompactionExecutor:1] 2011-03-27 08:08:10,065 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-531-Data.db')
INFO [CompactionExecutor:1] 2011-03-27 08:08:10,145 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-531-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
INFO [CompactionExecutor:1] 2011-03-27 08:08:10,145 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-532-Data.db')
INFO [CompactionExecutor:1] 2011-03-27 08:08:10,363 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-532-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
INFO [CompactionExecutor:1] 2011-03-27 08:08:10,364 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-533-Data.db')
INFO [CompactionExecutor:1] 2011-03-27 08:08:10,540 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-533-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
--------------------------------------------------------
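
For anyone who wants to pull the scrub outcome out of a log like Log #2 above, here is a minimal sketch in Python. It is not part of Cassandra or nodetool: the function name `summarize_scrub` and both regular expressions are my own assumptions, derived only from the line formats visible in this log, so other Cassandra versions may word these messages differently.

```python
import re

# Assumed format of the scrub-completion line, taken from Log #2 only.
SCRUB_DONE = re.compile(
    r"Scrub of SSTableReader\(path='(?P<path>[^']+)'\) complete: "
    r"(?P<rows>\d+) rows in new sstable and "
    r"(?P<dropped>\d+) empty \(tombstoned\) rows dropped"
)

# Assumed format of the row-size complaint, taken from Log #2 only.
IMPOSSIBLE_ROW = re.compile(r"IOException: Impossible row size (?P<size>\d+)")


def summarize_scrub(log_text):
    """Return (completed, bad_sizes) for a chunk of log text.

    completed: list of (sstable path, rows written, tombstoned rows dropped),
               in the order the completion lines appear.
    bad_sizes: sorted, de-duplicated row sizes that scrub reported as impossible
               (the same size appears twice per stack trace, so a set is used).
    """
    completed = [
        (m.group("path"), int(m.group("rows")), int(m.group("dropped")))
        for m in SCRUB_DONE.finditer(log_text)
    ]
    bad_sizes = sorted({int(m.group("size")) for m in IMPOSSIBLE_ROW.finditer(log_text)})
    return completed, bad_sizes
```

Calling `summarize_scrub(open(path_to_log).read())` on a saved copy of the log (the path is whatever your installation uses) gives a quick view of which sstables finished scrubbing and which row sizes were rejected. It only reads the log; it does not repair anything.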