[ https://issues.apache.org/jira/browse/FLINK-11499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17075847#comment-17075847 ]

Sivaprasanna Sethuraman commented on FLINK-11499:
-------------------------------------------------

[~pnowojski] Yes, that's correct. Even the currently supported bulk formats have this situation. So yes, we should probably include this as a feature.

> Extend StreamingFileSink BulkFormats to support arbitrary roll policies
> -----------------------------------------------------------------------
>
>                 Key: FLINK-11499
>                 URL: https://issues.apache.org/jira/browse/FLINK-11499
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / FileSystem
>            Reporter: Seth Wiesman
>            Priority: Major
>              Labels: usability
>             Fix For: 1.11.0
>
>
> Currently, when using the StreamingFileSink, bulk-encoding formats can only be
> combined with the `OnCheckpointRollingPolicy`, which rolls the in-progress
> part file on every checkpoint.
> However, many bulk formats such as Parquet are most efficient when written as
> large files; this is not possible when frequent checkpointing is enabled.
> Currently the only workaround is to use long checkpoint intervals, which is
> not ideal.
>
> The StreamingFileSink should be enhanced to support arbitrary roll policies
> so users may write large bulk files while retaining frequent checkpoints.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)