[ https://issues.apache.org/jira/browse/FLINK-35150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
qyw updated FLINK-35150:
------------------------
    Attachment:     (was: image-2024-04-18-11-20-43-126.png)

> The specified upload does not exist. The upload ID may be invalid
> -----------------------------------------------------------------
>
>                 Key: FLINK-35150
>                 URL: https://issues.apache.org/jira/browse/FLINK-35150
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.15.0
>            Reporter: qyw
>            Priority: Major
>         Attachments: image-2024-04-18-10-51-05-071.png,
> image-2024-04-18-11-03-08-998.png, image-2024-04-18-11-20-25-583.png
>
>
> I am writing CSV files to S3 with the Flink S3 Hadoop filesystem, with the
> patch from FLINK-28513 applied. I don't understand why the "sync" method of
> S3RecoverableFsDataOutputStream performs a "completeMultipartUpload"
> operation. Once "completeMultipartUpload" has run there, the parts belonging
> to that uploadID have already been merged, so a later call to close that
> uploads the rest of the stream will inevitably fail.
>
> As a result, when the message written as CSV is larger than
> "S3_MULTIPART_MIN_PART_SIZE", an uploadPart is started when switching files.
> When BulkPartWriter then performs closeForCommit, the "sync" method of
> S3RecoverableFsDataOutputStream has already called completeMultipartUpload,
> so the uploadPart issued by the "closeForCommit" method of
> S3RecoverableFsDataOutputStream fails with the error in the title.
>
> BulkPartWriter:
> !image-2024-04-18-11-03-08-998.png!
> CsvBulkWriter:
> !image-2024-04-18-11-20-43-126.png!
> S3RecoverableFsDataOutputStream:
> !image-2024-04-18-10-51-05-071.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
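
The call order described in the report can be illustrated with a minimal, self-contained sketch. It assumes a hypothetical in-memory stand-in for a multipart upload (FakeMultipartStore and SyncThenCloseDemo are illustration-only names, not Flink or AWS SDK classes) and only mimics the rule that an upload ID stops being usable once the upload has been completed. It is not the actual S3RecoverableFsDataOutputStream code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a multipart upload service (assumption, not a real API). */
class FakeMultipartStore {
    // Upload IDs that are still open, with the parts uploaded so far.
    private final Map<String, List<byte[]>> openUploads = new HashMap<>();
    private int nextId = 0;

    String initiate() {
        String uploadId = "upload-" + (nextId++);
        openUploads.put(uploadId, new ArrayList<>());
        return uploadId;
    }

    void uploadPart(String uploadId, byte[] part) {
        List<byte[]> parts = openUploads.get(uploadId);
        if (parts == null) {
            // Same wording as the error in the issue title.
            throw new IllegalStateException(
                    "The specified upload does not exist. The upload ID may be invalid");
        }
        parts.add(part);
    }

    /** Merges the parts and invalidates the upload ID; returns the number of merged parts. */
    int complete(String uploadId) {
        List<byte[]> parts = openUploads.remove(uploadId);
        if (parts == null) {
            throw new IllegalStateException(
                    "The specified upload does not exist. The upload ID may be invalid");
        }
        return parts.size();
    }
}

class SyncThenCloseDemo {
    /** Replays the order from the report: part upload, complete (in sync), part upload (in close). */
    static String run() {
        FakeMultipartStore store = new FakeMultipartStore();
        String uploadId = store.initiate();

        // A part is uploaded once the buffered data exceeds the minimum part size.
        store.uploadPart(uploadId, new byte[] {1});

        // What the report says happens inside sync(): the upload is completed.
        store.complete(uploadId);

        try {
            // What the report says closeForCommit() then attempts: upload the remaining bytes.
            store.uploadPart(uploadId, new byte[] {2});
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the second uploadPart fails because the upload ID was removed by complete, which matches the order of calls the reporter describes between sync and closeForCommit.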