I see. Then I'll remove the HintsColumnFamily data. Because our cluster holds a lot of data, running repair takes a long time (more than a day), and it is painful: it often fills the disk, creates many sstables, and degrades read performance. If the hints were easy to fix, that would have been a less painful solution. But I understand there's no other option in this case.
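A minimal sketch of locating the hint sstables before removing them, assuming the node is stopped first and that the files follow the `HintsColumnFamily-f-530-Data.db` naming shown in the scrub log below. The function names and the `dry_run` flag are hypothetical, and nothing is deleted unless `dry_run` is turned off.

```python
from pathlib import Path


def find_hint_sstables(system_dir):
    """Return every file in system_dir whose name starts with 'HintsColumnFamily-'."""
    # Assumption: all sstable components (Data, Index, ...) share this prefix.
    return sorted(Path(system_dir).glob("HintsColumnFamily-*"))


def remove_hint_sstables(system_dir, dry_run=True):
    """Return the hint files found; delete them only when dry_run is False."""
    files = find_hint_sstables(system_dir)
    for path in files:
        if not dry_run:
            path.unlink()
    return files
```

Calling `remove_hint_sstables("/data/cassandra/system")` would only list the matching files; passing `dry_run=False` deletes them, after which repair on the non-system data is what restores consistency.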
Thanks,
Shotaro

On Sun, Mar 27, 2011 at 11:51 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Why would you try to repair hints?
>
> If you run repair on the non-system data then you don't need the hint
> data and can remove it.
>
> On Sun, Mar 27, 2011 at 12:17 AM, Shotaro Kamio <kamios...@gmail.com> wrote:
>> Hi,
>>
>> Our cluster uses cassandra 0.7.4 (upgraded from 0.7.3) with
>> replication = 3. I found that error occurs on one node during hinted
>> handoff with following error (log #1 below).
>> When I tried out "scrub system HintsColumnFamily", I saw an ERROR in
>> log (log #2 below).
>> Do you think these errors are critical ?
>> I tried to "repair system HintsColumnFamily". But, it refuses to run
>> with "No neighbors". I can understand because hints are not
>> replicated. But then, is there any way to fix it without data loss?
>>
>> INFO [manual-repair-0996a2ec-26d3-4243-9586-d56daf30f9bd] 2011-03-27 13:55:05,664 AntiEntropyService.java (line 752) No neighbors to repair with: manual-repair-0996a2ec-26d3-4243-9586-d56daf30f9bd completed.
>>
>>
>> Best regards,
>> Shotaro
>>
>>
>> ---------------- Log #1: Error on hinted handoff ------------------------------------------------
>>
>> ERROR [HintedHandoff:1] 2011-03-26 20:04:22,528 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
>> java.lang.RuntimeException: java.lang.RuntimeException: error reading 4976040 of 4976067
>>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.lang.RuntimeException: error reading 4976040 of 4976067
>>     at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:83)
>>     at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:40)
>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>     at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
>>     at org.apache.commons.collections.iterators.CollatingIterator.anyHasNext(CollatingIterator.java:364)
>>     at org.apache.commons.collections.iterators.CollatingIterator.hasNext(CollatingIterator.java:217)
>>     at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:63)
>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>     at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116)
>>     at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130)
>>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1368)
>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
>>     at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:321)
>>     at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:88)
>>     at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:409)
>>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>     ... 3 more
>> Caused by: java.io.EOFException
>>     at java.io.RandomAccessFile.readByte(RandomAccessFile.java:591)
>>     at org.apache.cassandra.utils.ByteBufferUtil.readShortLength(ByteBufferUtil.java:324)
>>     at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:335)
>>     at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:351)
>>     at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:311)
>>     at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79)
>>     ... 21 more
>>
>> ------------------------------------------
>>
>> ---- Log #2: Error on scrub -------------------
>>
>> INFO [CompactionExecutor:1] 2011-03-27 08:07:34,527 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-530-Data.db')
>> WARN [CompactionExecutor:1] 2011-03-27 08:07:34,602 CompactionManager.java (line 607) Non-fatal error reading row (stacktrace follows)
>> java.io.IOError: java.io.IOException: Impossible row size 406136901
>>     at org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:589)
>>     at org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
>>     at org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.IOException: Impossible row size 406136901
>>     ... 8 more
>> INFO [CompactionExecutor:1] 2011-03-27 08:07:34,602 CompactionManager.java (line 613) Retrying from row index; data is 406134739 bytes starting at 21
>> INFO [CompactionExecutor:1] 2011-03-27 08:08:10,065 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-530-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>> INFO [CompactionExecutor:1] 2011-03-27 08:08:10,065 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-531-Data.db')
>> INFO [CompactionExecutor:1] 2011-03-27 08:08:10,145 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-531-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>> INFO [CompactionExecutor:1] 2011-03-27 08:08:10,145 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-532-Data.db')
>> INFO [CompactionExecutor:1] 2011-03-27 08:08:10,363 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-532-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>> INFO [CompactionExecutor:1] 2011-03-27 08:08:10,364 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-533-Data.db')
>> INFO [CompactionExecutor:1] 2011-03-27 08:08:10,540 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-533-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>> --------------------------------------------------------
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>

--
Shotaro Kamio