I see. Then, I'll remove the HintsColumnFamily.

Because our cluster holds a lot of data, running repair takes a long time
(more than a day), and it is painful: it often fills the disk,
creates many sstables, and degrades read performance.
If the hints were easy to fix, that would be a less painful solution. But I
understand there's no other option in this case.


Thanks,
Shotaro


On Sun, Mar 27, 2011 at 11:51 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Why would you try to repair hints?
>
> If you run repair on the non-system data then you don't need the hint
> data and can remove it.
>
> On Sun, Mar 27, 2011 at 12:17 AM, Shotaro Kamio <kamios...@gmail.com> wrote:
>> Hi,
>>
>> Our cluster uses Cassandra 0.7.4 (upgraded from 0.7.3) with
>> replication = 3. One node fails during hinted handoff with the
>> error shown in log #1 below.
>> When I tried "scrub system HintsColumnFamily", I saw an ERROR in
>> the log (log #2 below).
>> Do you think these errors are critical?
>> I also tried "repair system HintsColumnFamily", but it refuses to run
>> with "No neighbors". I can understand that, because hints are not
>> replicated. But then, is there any way to fix it without data loss?
>>
>>  INFO [manual-repair-0996a2ec-26d3-4243-9586-d56daf30f9bd] 2011-03-27 13:55:05,664 AntiEntropyService.java (line 752) No neighbors to repair with: manual-repair-0996a2ec-26d3-4243-9586-d56daf30f9bd completed.
>>
>>
>> Best regards,
>> Shotaro
>>
>>
>> ---- Log #1: Error on hinted handoff -------------------
>>
>> ERROR [HintedHandoff:1] 2011-03-26 20:04:22,528 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
>> java.lang.RuntimeException: java.lang.RuntimeException: error reading 4976040 of 4976067
>>        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.lang.RuntimeException: error reading 4976040 of 4976067
>>        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:83)
>>        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:40)
>>        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>        at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
>>        at org.apache.commons.collections.iterators.CollatingIterator.anyHasNext(CollatingIterator.java:364)
>>        at org.apache.commons.collections.iterators.CollatingIterator.hasNext(CollatingIterator.java:217)
>>        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:63)
>>        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116)
>>        at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130)
>>        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1368)
>>        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
>>        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
>>        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:321)
>>        at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:88)
>>        at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:409)
>>        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>        ... 3 more
>> Caused by: java.io.EOFException
>>        at java.io.RandomAccessFile.readByte(RandomAccessFile.java:591)
>>        at org.apache.cassandra.utils.ByteBufferUtil.readShortLength(ByteBufferUtil.java:324)
>>        at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:335)
>>        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:351)
>>        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:311)
>>        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79)
>>        ... 21 more
>>
>> ------------------------------------------
>>
>> ---- Log #2: Error on scrub -------------------
>>
>>  INFO [CompactionExecutor:1] 2011-03-27 08:07:34,527 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-530-Data.db')
>>  WARN [CompactionExecutor:1] 2011-03-27 08:07:34,602 CompactionManager.java (line 607) Non-fatal error reading row (stacktrace follows)
>> java.io.IOError: java.io.IOException: Impossible row size 406136901
>>        at org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:589)
>>        at org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
>>        at org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
>>        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.IOException: Impossible row size 406136901
>>        ... 8 more
>>  INFO [CompactionExecutor:1] 2011-03-27 08:07:34,602 CompactionManager.java (line 613) Retrying from row index; data is 406134739 bytes starting at 21
>>  INFO [CompactionExecutor:1] 2011-03-27 08:08:10,065 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-530-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>>  INFO [CompactionExecutor:1] 2011-03-27 08:08:10,065 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-531-Data.db')
>>  INFO [CompactionExecutor:1] 2011-03-27 08:08:10,145 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-531-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>>  INFO [CompactionExecutor:1] 2011-03-27 08:08:10,145 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-532-Data.db')
>>  INFO [CompactionExecutor:1] 2011-03-27 08:08:10,363 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-532-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>>  INFO [CompactionExecutor:1] 2011-03-27 08:08:10,364 CompactionManager.java (line 512) Scrubbing SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-533-Data.db')
>>  INFO [CompactionExecutor:1] 2011-03-27 08:08:10,540 CompactionManager.java (line 653) Scrub of SSTableReader(path='/data/cassandra/system/HintsColumnFamily-f-533-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
>> --------------------------------------------------------
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
Shotaro Kamio
