Hi, Rahul. I've run nodetool upgradesstables only on the problematic CF. It threw the following exception:

Error occurred while upgrading the sstables for keyspace Sessions
java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:188)
        at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:271)
        at org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:287)
        at org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:977)
        at org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2191)
        … …
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
        at org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:134)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
        at org.apache.cassandra.db.compaction.CompactionManager$4.perform(CompactionManager.java:301)
        at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:250)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        ... 3 more
Caused by: java.io.IOException: dataSize of 3622081913630118729 starting at 32906 would be larger than file /mnt/cassandra/data/Sessions/Users/Sessions-Users-ib-2516-Data.db length 1038893416
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
        ... 20 more

Regards,
Francisco


On Jan 29, 2014, at 3:38 PM, Rahul Menon <ra...@apigee.com> wrote:

> Francisco,
>
> the sstables with *-ib-* is something that was from a previous version of c*.
> The *-ib-* naming convention started at c* 1.2.1 but 1.2.10 onwards I'm sure
> it has the *-ic-* convention. You could try running a nodetool sstableupgrade
> which should ideally upgrade the sstables with the *-ib-* to *-ic-*.
>
> Rahul
>
> On Wed, Jan 29, 2014 at 12:55 AM, Francisco Nogueira Calmon Sobral
> <fsob...@igcorp.com.br> wrote:
> Dear experts,
>
> We are facing an annoying problem in our cluster.
>
> We have 9 amazon extra large linux nodes, running Cassandra 1.2.11.
>
> The short story is that after moving the data from one cluster to another,
> we've been unable to run 'nodetool repair'. It gets stuck due to a
> CorruptSSTableException in some nodes and CFs.
> After looking at some problematic CFs, we observed that some of them have
> root permissions, instead of cassandra permissions. Also, their names are
> different from the 'good' ones, as we can see below:
>
> BAD
> ------
> -rw-r--r-- 8 cassandra cassandra 991M Nov  8 15:11 Sessions-Users-ib-2516-Data.db
> -rw-r--r-- 8 cassandra cassandra 703M Nov  8 15:11 Sessions-Users-ib-2516-Index.db
> -rw-r--r-- 8 cassandra cassandra 5.3M Nov 13 11:42 Sessions-Users-ib-2516-Summary.db
>
> GOOD
> ---------
> -rw-r--r-- 1 cassandra cassandra  22K Jan 15 10:50 Sessions-Users-ic-2933-CompressionInfo.db
> -rw-r--r-- 1 cassandra cassandra 106M Jan 15 10:50 Sessions-Users-ic-2933-Data.db
> -rw-r--r-- 1 cassandra cassandra 2.2M Jan 15 10:50 Sessions-Users-ic-2933-Filter.db
> -rw-r--r-- 1 cassandra cassandra  76M Jan 15 10:50 Sessions-Users-ic-2933-Index.db
> -rw-r--r-- 1 cassandra cassandra 4.3K Jan 15 10:50 Sessions-Users-ic-2933-Statistics.db
> -rw-r--r-- 1 cassandra cassandra 574K Jan 15 10:50 Sessions-Users-ic-2933-Summary.db
> -rw-r--r-- 1 cassandra cassandra   79 Jan 15 10:50 Sessions-Users-ic-2933-TOC.txt
>
>
> We changed the permissions back to 'cassandra' and ran 'nodetool scrub' on
> this problematic CF, but it has been running for at least two weeks (it is
> not frozen) and keeps logging many WARNs while working on the
> above-mentioned SSTable:
>
> WARN [CompactionExecutor:15] 2014-01-28 17:01:22,571 OutputHandler.java (line 57) Non-fatal error reading row (stacktrace follows)
> java.io.IOError: java.io.IOException: Impossible row size 3618452438597849419
>         at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:171)
>         at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:526)
>         at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:515)
>         at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:70)
>         at org.apache.cassandra.db.compaction.CompactionManager$3.perform(CompactionManager.java:280)
>         at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:250)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Impossible row size 3618452438597849419
>         ... 10 more
>
>
> 1) I do not think that deleting all data of one node and running 'nodetool
> rebuild' will work, since we observed that this problem occurs in all nodes.
> So we may not be able to restore all the data. What can be done in this case?
>
> 2) Why are the permissions of some sstables 'root'? Is this problem caused by
> our manual migration of data? (see long story below)
>
>
> How did we run into this?
>
> The long story is that we tried to move our cluster with sstableloader,
> but it was unable to load all the data correctly. Our solution was to put ALL
> cluster data into EACH new node and run 'nodetool refresh'. I performed this
> task for each node and each column family sequentially. Sometimes I had to
> rename some sstables, because they came from different nodes with the same
> name. I don't remember if I ran 'nodetool repair' or even 'nodetool cleanup'
> on each node. Apparently, the process was successful, and (almost) all the
> data was moved.
>
> Unfortunately, 3 months after the move, I am unable to perform read
> operations on some keys of some CFs. I think that some of these keys belong
> to the above-mentioned sstables.
>
> Any insights are welcome.
>
> Best regards,
> Francisco Sobral
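
P.S. For anyone comparing the BAD and GOOD listings in the quoted message, here is a minimal sketch in Python that groups SSTable file names by format version and generation and reports which component files each generation has. It assumes the <keyspace>-<cf>-<version>-<generation>-<component> naming visible in the listings; the function names are hypothetical helpers, not part of Cassandra or nodetool. It only compares file names. It cannot tell whether a missing component actually matters for a given format version.

```python
import re
from collections import defaultdict

# Naming pattern inferred from the listings in this thread (an assumption):
#   <keyspace>-<cf>-<version>-<generation>-<component>
SSTABLE_NAME = re.compile(
    r"^(?P<keyspace>\w+)-(?P<cf>\w+)-(?P<version>[a-z]{2})"
    r"-(?P<generation>\d+)-(?P<component>[\w.]+)$"
)


def sstable_components(filenames):
    """Map (version, generation) to the sorted list of component files found."""
    found = defaultdict(set)
    for name in filenames:
        match = SSTABLE_NAME.match(name)
        if match is None:
            continue  # not an SSTable component name; ignore it
        key = (match.group("version"), int(match.group("generation")))
        found[key].add(match.group("component"))
    return {key: sorted(parts) for key, parts in found.items()}


def absent_components(components, suspect, reference):
    """Components present for `reference` but absent for `suspect`."""
    return sorted(set(components[reference]) - set(components[suspect]))


if __name__ == "__main__":
    # File names copied from the two listings in the quoted message.
    listing = [
        "Sessions-Users-ib-2516-Data.db",
        "Sessions-Users-ib-2516-Index.db",
        "Sessions-Users-ib-2516-Summary.db",
        "Sessions-Users-ic-2933-CompressionInfo.db",
        "Sessions-Users-ic-2933-Data.db",
        "Sessions-Users-ic-2933-Filter.db",
        "Sessions-Users-ic-2933-Index.db",
        "Sessions-Users-ic-2933-Statistics.db",
        "Sessions-Users-ic-2933-Summary.db",
        "Sessions-Users-ic-2933-TOC.txt",
    ]
    components = sstable_components(listing)
    for (version, generation), parts in sorted(components.items()):
        print(version, generation, parts)
    print(absent_components(components, ("ib", 2516), ("ic", 2933)))
```

To run it against a live node, the hard-coded listing could be replaced with os.listdir() on the column family's data directory (here /mnt/cassandra/data/Sessions/Users).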