[ 
https://issues.apache.org/jira/browse/FLINK-11318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742965#comment-16742965
 ] 

Edward Rojas commented on FLINK-11318:
--------------------------------------

Yes [~kkl0u], I did verify that this can happen. It occurs in the scenario I 
mentioned in the description: when migrating a job from BucketingSink to 
StreamingFileSink while keeping the same buckets as in the previous version of 
the job, finished part files are overwritten (I'm using HDFS).

> [Regression] StreamingFileSink can overwrite existing files
> -----------------------------------------------------------
>
>                 Key: FLINK-11318
>                 URL: https://issues.apache.org/jira/browse/FLINK-11318
>             Project: Flink
>          Issue Type: Bug
>          Components: filesystem-connector
>    Affects Versions: 1.6.3, 1.7.1
>            Reporter: Edward Rojas
>            Assignee: Kostas Kloudas
>            Priority: Major
>
> StreamingFileSink does not check whether a file with the same name as the new 
> part file already exists, which can result in overwriting that file.
> The BucketingSink performs this kind of validation in the "openNewPartFile" 
> method here: 
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.java#L549-L561
> So this seems to be a regression, as this works in the "old" BucketingSink.
>  
> This can be problematic, for example, when migrating a job from BucketingSink 
> to StreamingFileSink: files could be overwritten.
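
To illustrate the kind of validation described above, here is a minimal sketch
of a "skip names that are already taken" check. It is not the Flink code: it
uses java.nio.file instead of the Hadoop FileSystem API so that it is
self-contained, and the class name, method name, and the "part-<subtask>-<counter>"
naming with "_" / ".pending" / ".in-progress" markers are assumptions made for
the example.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class PartFileNaming {

    /**
     * Returns the first part-file path in bucketDir, starting at startCounter,
     * for which no finished, pending, or in-progress file exists yet.
     * Hypothetical helper: names and suffixes are assumptions for illustration.
     */
    static Path nextFreePartFile(Path bucketDir, int subtaskIndex, int startCounter) {
        int counter = startCounter;
        while (true) {
            String name = "part-" + subtaskIndex + "-" + counter;
            Path finished = bucketDir.resolve(name);
            Path pending = bucketDir.resolve("_" + name + ".pending");
            Path inProgress = bucketDir.resolve("_" + name + ".in-progress");

            // Only accept the name if it is unused in every state;
            // otherwise move on to the next counter instead of overwriting.
            if (!Files.exists(finished)
                    && !Files.exists(pending)
                    && !Files.exists(inProgress)) {
                return finished;
            }
            counter++;
        }
    }
}
```

With a check like this, a sink that starts writing into a bucket that already
contains part-0-0 would pick the next unused counter rather than reuse the
existing file name.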



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
