Yuan Kui created FLINK-32566:
--------------------------------

             Summary: HeapBytesVector meets java.lang.NegativeArraySizeException
                 Key: FLINK-32566
                 URL: https://issues.apache.org/jira/browse/FLINK-32566
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Runtime
    Affects Versions: 1.17.1
            Reporter: Yuan Kui


When reading Parquet files, if the data in some fields is too large, the
capacity that HeapBytesVector computes when expanding its buffer exceeds the
int maximum value, and the read fails with java.lang.NegativeArraySizeException.
{code:java}
switched from RUNNING to FAILED with failure cause: java.lang.RuntimeException: One or more fetchers have encountered exception
        at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:233)
        at org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:168)
        at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:129)
        at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:396)
        at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:69)
        at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:426)
        at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623)
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:786)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: SplitFetcher thread 0 received unexpected exception while polling the records
        at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:163)
        at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:112)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        ... 1 more
Caused by: java.lang.NegativeArraySizeException
        at org.apache.flink.table.data.vector.heap.HeapBytesVector.reserve(HeapBytesVector.java:100)
        at org.apache.flink.table.data.vector.heap.HeapBytesVector.appendBytes(HeapBytesVector.java:77)
        at org.apache.flink.formats.parquet.vector.reader.BytesColumnReader.readBinary(BytesColumnReader.java:88)
        at org.apache.flink.formats.parquet.vector.reader.BytesColumnReader.readBatch(BytesColumnReader.java:50)
        at org.apache.flink.formats.parquet.vector.reader.BytesColumnReader.readBatch(BytesColumnReader.java:31)
        at org.apache.flink.formats.parquet.vector.reader.AbstractColumnReader.readToVector(AbstractColumnReader.java:189)
        at org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.nextBatch(ParquetVectorizedInputFormat.java:405)
        at org.apache.flink.formats.parquet.ParquetVectorizedInputFormat$ParquetReader.readBatch(ParquetVectorizedInputFormat.java:373)
        at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:71)
        at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
        at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:160)
        ... 6 more

{code}
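
The overflow described above can be reproduced in isolation. The following is a minimal sketch, not the actual HeapBytesVector code: it assumes the buffer grows by doubling the required capacity with int arithmetic, which this report does not show. The class name, method names and the MAX_ARRAY_LENGTH constant are hypothetical and only serve the illustration.

```java
// Illustration only: CapacityGrowthSketch is a hypothetical class, not part of Flink.
final class CapacityGrowthSketch {

    // Assumed upper bound for a byte[] length; some JVMs reject arrays
    // whose length is within a few elements of Integer.MAX_VALUE.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    // Assumed growth policy: doubling in int arithmetic wraps around silently
    // once requiredCapacity exceeds Integer.MAX_VALUE / 2.
    static int naiveNewCapacity(int requiredCapacity) {
        return requiredCapacity * 2;
    }

    // One possible guard: double in long arithmetic, then clamp to the array limit.
    static int safeNewCapacity(int requiredCapacity) {
        if (requiredCapacity < 0 || requiredCapacity > MAX_ARRAY_LENGTH) {
            throw new IllegalArgumentException(
                    "Required capacity " + requiredCapacity + " cannot be satisfied");
        }
        long doubled = 2L * requiredCapacity;
        return (int) Math.min(doubled, (long) MAX_ARRAY_LENGTH);
    }

    public static void main(String[] args) {
        int required = 1_200_000_000; // roughly 1.2 GB of accumulated bytes in one vector

        // Negative value; passing it to "new byte[...]" would throw NegativeArraySizeException.
        System.out.println(naiveNewCapacity(required));

        // Clamped to MAX_ARRAY_LENGTH instead of wrapping around.
        System.out.println(safeNewCapacity(required));
    }
}
```

Clamping only postpones the limit: a single vector still cannot hold more bytes than a Java array allows, so a real fix may also need to bound how much data is read into one batch.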



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
