[ https://issues.apache.org/jira/browse/FLINK-36517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leonard Xu updated FLINK-36517:
-------------------------------
    Fix Version/s: cdc-3.2.1

> Duplicate commit the same datafile in Paimon Sink
> -------------------------------------------------
>
>                 Key: FLINK-36517
>                 URL: https://issues.apache.org/jira/browse/FLINK-36517
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>    Affects Versions: cdc-3.2.0
>         Environment: flink-1.18.0
>                      flink cdc 3.2.0
>            Reporter: JunboWang
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: cdc-3.2.1
>
>
> Same as https://issues.apache.org/jira/browse/FLINK-35938.
> When the job restarts from a failure, the same data file may be added twice in the PaimonCommitter.
>
> {code:java}
> 2024-09-26 19:50:04,623 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink Writer: cupid_dev (1/1) (44d8fba307e7ff3403a94f71507d4523_26351f8267c5887c12c827914f3a91a9_0_1) switched from RUNNING to FAILED on container_e47_1710209318993_56454712_01_000002 @ xxxxxxxxxx (dataPort=41512).
> java.io.IOException: java.lang.IllegalStateException: Trying to add file {org.apache.paimon.data.BinaryRow@9c67b85d, 6, 0, data-f6d7562e-7995-464e-909d-aff0c19b244b-359.parquet} which is already added.
>     at org.apache.flink.cdc.connectors.paimon.sink.v2.PaimonWriter.write(PaimonWriter.java:138) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.processElement(SinkWriterOperator.java:161) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.cdc.runtime.operators.sink.DataSinkWriterOperator.processElement(DataSinkWriterOperator.java:177) ~[flink-cdc-dist.jar:3.2.0]
>     at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:237) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:146) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:562) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:971) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:950) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:764) ~[flink-dist-1.18.0.jar:1.18.0]
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) ~[flink-dist-1.18.0.jar:1.18.0]
>     at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_202]
> Caused by: java.lang.IllegalStateException: Trying to add file {org.apache.paimon.data.BinaryRow@9c67b85d, 6, 0, data-f6d7562e-7995-464e-909d-aff0c19b244b-359.parquet} which is already added.
>     at org.apache.paimon.utils.Preconditions.checkState(Preconditions.java:204) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.manifest.FileEntry.mergeEntries(FileEntry.java:130) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.manifest.FileEntry.mergeEntries(FileEntry.java:112) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.operation.AbstractFileStoreScan.readAndMergeFileEntries(AbstractFileStoreScan.java:391) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.operation.AbstractFileStoreScan.doPlan(AbstractFileStoreScan.java:295) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.operation.AbstractFileStoreScan.plan(AbstractFileStoreScan.java:219) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.operation.AbstractFileStoreWrite.scanExistingFileMetas(AbstractFileStoreWrite.java:425) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.operation.AbstractFileStoreWrite.createWriterContainer(AbstractFileStoreWrite.java:384) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.operation.AbstractFileStoreWrite.lambda$getWriterWrapper$2(AbstractFileStoreWrite.java:359) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at java.util.HashMap.computeIfAbsent(HashMap.java:1127) ~[?:1.8.0_202]
>     at org.apache.paimon.operation.AbstractFileStoreWrite.getWriterWrapper(AbstractFileStoreWrite.java:358) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.operation.AbstractFileStoreWrite.write(AbstractFileStoreWrite.java:127) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.paimon.table.sink.TableWriteImpl.writeAndReturn(TableWriteImpl.java:151) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.flink.cdc.connectors.paimon.sink.v2.StoreSinkWriteImpl.write(StoreSinkWriteImpl.java:151) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     at org.apache.flink.cdc.connectors.paimon.sink.v2.PaimonWriter.write(PaimonWriter.java:136) ~[flink-cdc-pipeline-connector-paimon.jar:3.2.0]
>     ... 15 more{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
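The IllegalStateException is raised by Paimon's Preconditions.checkState inside FileEntry.mergeEntries, which rejects a data file that appears more than once among the entries being merged. As an illustration only (hypothetical class and method names, not the actual connector fix), making the commit path replay-safe amounts to filtering out data files that were already recorded before a restarted job re-submits them:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hedged sketch of the idempotency guard this ticket is about: after a
// restart, the writer may re-submit a data file that an earlier attempt
// already added. Deduplicating by file name before committing keeps the
// "each file appears once" invariant that Paimon enforces.
public class CommitDedupe {

    // Returns only the data files not yet in `committed`; already-seen
    // files are silently dropped instead of triggering a failure.
    public static List<String> filterAlreadyAdded(List<String> dataFiles, Set<String> committed) {
        List<String> fresh = new ArrayList<>();
        for (String file : dataFiles) {
            if (committed.add(file)) { // add() returns false if the file was already recorded
                fresh.add(file);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Set<String> committed = new LinkedHashSet<>();
        List<String> files = List.of("data-f6d7562e-7995-464e-909d-aff0c19b244b-359.parquet");
        // First (failed) attempt records the file.
        System.out.println(filterAlreadyAdded(files, committed));
        // The replay after restore yields nothing new, so no duplicate commit.
        System.out.println(filterAlreadyAdded(files, committed)); // prints "[]"
    }
}
```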