[ https://issues.apache.org/jira/browse/FLINK-18592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Lin updated FLINK-18592:
-----------------------------
    Priority: Major  (was: Minor)

> StreamingFileSink fails due to truncating HDFS file failure
> -----------------------------------------------------------
>
>                 Key: FLINK-18592
>                 URL: https://issues.apache.org/jira/browse/FLINK-18592
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.10.1
>            Reporter: JIAN WANG
>            Priority: Major
>              Labels: auto-deprioritized-major, pull-request-available
>             Fix For: 1.10.4, 1.14.0, 1.11.5
>
>
> I hit this issue on Flink 1.10.1. I run Flink on YARN (3.0.0-cdh6.3.2) with
> StreamingFileSink.
> The relevant code looks like this:
> {code}
> public static <IN> StreamingFileSink<IN> build(String dir, BucketAssigner<IN, String> assigner, String prefix) {
>     return StreamingFileSink.forRowFormat(new Path(dir), new SimpleStringEncoder<IN>())
>         .withRollingPolicy(
>             DefaultRollingPolicy
>                 .builder()
>                 .withRolloverInterval(TimeUnit.HOURS.toMillis(2))
>                 .withInactivityInterval(TimeUnit.MINUTES.toMillis(10))
>                 .withMaxPartSize(1024L * 1024L * 1024L * 50) // Max 50GB
>                 .build())
>         .withBucketAssigner(assigner)
>         .withOutputFileConfig(OutputFileConfig.builder().withPartPrefix(prefix).build())
>         .build();
> }
> {code}
> The error is:
> {noformat}
> java.io.IOException: Problem while truncating file: hdfs:///business_log/hashtag/2020-06-25/.hashtag-122-37.inprogress.8e65f69c-b5ba-4466-a844-ccc0a5a93de2
> {noformat}
> Because of this failure, the job cannot restart from the latest checkpoint
> or savepoint.
> My current workaround is to retain the latest three checkpoints; when the
> job fails, I manually restart it from the second-to-last checkpoint.
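
The workaround described in the report (retain several checkpoints, then restart from the second-to-last one) could be sketched roughly as below. This is only an illustration, not code from the issue: the class `PenultimateCheckpoint` and its `pick` method are hypothetical helper names, and the sketch assumes the retained checkpoint directories follow Flink's usual `chk-<id>` naming, with the number of retained checkpoints raised via the `state.checkpoints.num-retained` option. It only selects a directory name from a listing; it does not talk to HDFS or restart the job.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: picks the second-newest checkpoint directory name.
public class PenultimateCheckpoint {

    // Assumption: checkpoint directories are named "chk-<numeric id>".
    private static final Pattern CHK = Pattern.compile("chk-(\\d+)");

    public static Optional<String> pick(List<String> dirNames) {
        // Keep only names matching chk-<id>, then sort the ids newest first.
        List<Long> ids = dirNames.stream()
                .map(CHK::matcher)
                .filter(Matcher::matches)
                .map(m -> Long.parseLong(m.group(1)))
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
        // With fewer than two checkpoints there is no second-to-last one.
        return ids.size() < 2
                ? Optional.empty()
                : Optional.of("chk-" + ids.get(1));
    }

    public static void main(String[] args) {
        // Example listing of a job's checkpoint directory (made-up names).
        List<String> listing = Arrays.asList("chk-41", "chk-43", "chk-42", "shared", "taskowned");
        System.out.println(pick(listing).orElse("none")); // prints "chk-42"
    }
}
```

The selected directory would then be passed as the restore path when resubmitting the job, for example with `flink run -s <path-to-chk-dir> ...`.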