Alex Smirnov created FLINK-9143: ----------------------------------- Summary: Restart strategy defined in flink-conf.yaml is ignored Key: FLINK-9143 URL: https://issues.apache.org/jira/browse/FLINK-9143 Project: Flink Issue Type: Bug Affects Versions: 1.4.2 Reporter: Alex Smirnov Attachments: execution_config.png, jobmanager.log, jobmanager.png
Restart strategy defined in flink-conf.yaml is disregarded, when user enables checkpointing. Steps to reproduce: 1. Download flink distribution (1.4.2), update flink-conf.yaml: restart-strategy: none state.backend: rocksdb state.backend.fs.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-metadata state.backend.rocksdb.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-rocksdb 2. create new java project as described at [https://ci.apache.org/projects/flink/flink-docs-release-1.4/quickstart/java_api_quickstart.html] here's the code: public class FailedJob { static final Logger LOGGER = LoggerFactory.getLogger(FailedJob.class); public static void main( String[] args ) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE); DataStream<String> stream = env.fromCollection(Arrays.asList("test")); stream.map(new MapFunction<String, String>(){ @Override public String map(String obj) { throw new NullPointerException("NPE"); } }); env.execute("Failed job"); } } 3. Compile: mvn clean package; submit it to the cluster 4. Go to Job Manager configuration in WebUI, ensure settings from flink-conf.yaml is there (screenshot attached) 5. Go to Job's configuration, see Execution Configuration section *Expected result*: restart strategy as defined in flink-conf.yaml *Actual result*: Restart with fixed delay (10000 ms). #[2147483647|tel:(214)%20748-3647] restart attempts. see attached screenshots and jobmanager log (line 1 and 31) -- This message was sent by Atlassian JIRA (v7.6.3#76005)