[
https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776045#comment-17776045
]
Alex Petrov commented on CASSANDRA-18932:
-----------------------------------------
[~brandon.williams] sure, the attached sstables can be used to reliably
reproduce the issue. I also tested against
[https://github.com/apache/cassandra/commit/15be17ecef53adf575732fc8aa0f86eb1a774092]
and can confirm that the issue does not reproduce at that commit.
As soon as I have a Harry branch up, I can also share a dtest that reproduces
this in about 17 seconds, and we can try shrinking the commands, since we do
know which page/row it breaks on.
> Harry-found CorruptSSTableException / RT Closer issue when reading entire
> partition
> -----------------------------------------------------------------------------------
>
> Key: CASSANDRA-18932
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18932
> Project: Cassandra
> Issue Type: Bug
> Reporter: Alex Petrov
> Priority: Normal
> Attachments: node1_.zip, operation.log.zip
>
>
> While testing some new machinery for Harry, I encountered a new RT
> closer / SSTable corruption issue. I have grounds to believe it was
> introduced during the last year.
> The issue seems to be caused by an intricate interleaving of flushes with
> writes and deletes.
> {code:java}
> ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 -
> Exception in thread Thread[ReadStage-2,5,SharedPool]
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted:
> RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db
> - LZ4Compressor, chunk length 16384, data length 232569)
> at
> org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381)
> at
> org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242)
> at
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
> at
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
> at
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> at
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
> at
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376)
> at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188)
> at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157)
> at
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> at
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534)
> at
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402)
> at
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> at
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
> at
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
> at
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> at
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
> at
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:151)
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:101)
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:86)
> at
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:343)
> at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:201)
> at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:186)
> at
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
> at
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:346)
> at
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:2186)
> at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2581)
> at
> org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
> at
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Suppressed: java.lang.IllegalStateException: PROCESSED
> UnfilteredRowIterator for harry.table_1 (key:
> ZinzDdUuABgDknItABgDknItABgDknItXEFrgBnOmPmPylWrwXHqjBHgeQrGfnZd1124124583:ZinzDdUuABgDknItABgDknItABgDknItABgDknItABgDknItzHqchghqCXLhVYKM22215251:3.2758E-41
> omdt: [deletedAt=564416, localDeletion=1697450085]) has an illegal RT bounds
> sequence: expected all RTs to be closed, but the last one is open
> at
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:117)
> at
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.onPartitionClose(RTBoundValidator.java:112)
> at
> org.apache.cassandra.db.transform.BaseRows.runOnClose(BaseRows.java:91)
> at
> org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:95)
> at
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:341)
> ... 10 common frames omitted
> Caused by: java.io.IOException: Invalid Columns subset bytes; too many bits
> set:10
> at
> org.apache.cassandra.db.Columns$Serializer.deserializeSubset(Columns.java:578)
> at
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:604)
> at
> org.apache.cassandra.db.UnfilteredDeserializer.readNext(UnfilteredDeserializer.java:143)
> at
> org.apache.cassandra.io.sstable.format.big.SSTableIterator$ForwardIndexedReader.computeNext(SSTableIterator.java:175)
> at
> org.apache.cassandra.io.sstable.AbstractSSTableIterator$ForwardReader.hasNextInternal(AbstractSSTableIterator.java:533)
> at
> org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:368)
> ... 31 common frames omitted
> {code}
>
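For readers less familiar with the suppressed IllegalStateException in the trace above: the message states the invariant being checked, namely that every range tombstone (RT) opened within a partition must be closed by the time the partition iterator is closed. Below is a minimal, self-contained sketch of that invariant. It is a simplified model and not the Cassandra RTBoundValidator; the class name, the Kind enum and the validate method are hypothetical names chosen for illustration.

```java
import java.util.List;

// Simplified model of the "all RTs must be closed" invariant.
// Hypothetical names; this is NOT the Cassandra RTBoundValidator.
final class RtBoundsSketch {
    // Kinds of items an (assumed) partition stream can contain.
    enum Kind { ROW, RT_OPEN, RT_CLOSE, RT_BOUNDARY }

    // Walks the stream and throws if open/close markers do not pair up.
    static void validate(List<Kind> unfiltereds) {
        boolean open = false;
        for (Kind kind : unfiltereds) {
            switch (kind) {
                case RT_OPEN:
                    if (open)
                        throw new IllegalStateException("open marker while an RT is already open");
                    open = true;
                    break;
                case RT_CLOSE:
                    if (!open)
                        throw new IllegalStateException("close marker without an open RT");
                    open = false;
                    break;
                case RT_BOUNDARY:
                    // A boundary closes one RT and opens the next, so one must be open.
                    if (!open)
                        throw new IllegalStateException("boundary marker without an open RT");
                    break;
                default:
                    // Rows do not change the open/closed state.
                    break;
            }
        }
        // The check that corresponds to the message in the trace.
        if (open)
            throw new IllegalStateException("expected all RTs to be closed, but the last one is open");
    }
}
```

In this model, a stream that is cut short after an open marker (for example because the underlying read failed part-way through) fails the final check. That is one plausible reading of why the RT error shows up as suppressed alongside the CorruptSSTableException rather than as the primary failure.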
> Unfortunately, the Harry branch is not ready for release yet. That said, I
> have a snapshot pinned for quick repro and can share SSTables that will
> easily reproduce the issue if there are any takers.
> To reproduce:
> {code:java}
> -- make sure to set page size to 5 (it breaks with other page sizes, too)
> paging 5;
> select * from harry.table_1;
> {code}
> Please also make sure to modify snitch to set rack/dc:
> {code:java}
> + public static final String DATA_CENTER_NAME = "datacenter0";
> + public static final String RACK_NAME = "rack0";
> {code}
> And set directories in cassandra.yaml:
> {code:java}
> +data_file_directories:
> + - /your/path/data/data0
> + - /your/path/data/data1
> + - /your/path/data/data2
> {code}
> SHA for repro: 865d7c30e4755e74c4e4d26205a7aed4cfb55710
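A note on the root cause line in the trace ("Invalid Columns subset bytes; too many bits set:10"): the sketch below shows one way such a check can arise. It assumes that a row's column subset is encoded as a bitmap with one bit per column of the table's full column set, that a set bit marks a column as absent, and that the value after the colon is the binary form of the leftover bits. All of these are assumptions for illustration, and the class and method names are hypothetical. This is a simplified model, not the Cassandra source.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Simplified model of decoding a column-subset bitmap.
// Hypothetical names; assumes fewer than 64 columns in the superset.
final class ColumnsSubsetSketch {
    // Returns the columns of the superset whose bit is NOT set (assumed: set bit = absent).
    static List<String> decodeSubset(List<String> superset, long encoded) throws IOException {
        List<String> present = new ArrayList<>();
        for (String column : superset) {
            if ((encoded & 1L) == 0)
                present.add(column);
            encoded >>>= 1; // consume one bit per superset column
        }
        // Any bits left over refer to columns that do not exist in the superset,
        // so the bytes cannot be a valid subset for this table.
        if (encoded != 0)
            throw new IOException("too many bits set: " + Long.toBinaryString(encoded));
        return present;
    }
}
```

Under these assumptions, decoding the bitmap 0b10000 against a three-column superset leaves the bits "10" after the three expected positions are consumed, which has the same shape as the message in the trace. Leftover bits of this kind would suggest the reader is interpreting bytes at the wrong offset or against the wrong column set, rather than that the row genuinely carries extra columns.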