Elias Levy created FLINK-9450:
---------------------------------

             Summary: Job hangs if S3 access is denied during checkpoints
                 Key: FLINK-9450
                 URL: https://issues.apache.org/jira/browse/FLINK-9450
             Project: Flink
          Issue Type: Bug
          Components: State Backends, Checkpointing
    Affects Versions: 1.4.2
            Reporter: Elias Levy

We have a streaming job that consumes from and writes to Kafka. The job is configured to checkpoint to S3.

If we deny access to S3 by using iptables on the TM host to block all outgoing connections to ports 80 and 443, the job soon stops publishing to Kafka. This happens whether the rules use DROP or REJECT, and whether REJECT is used with --reject-with tcp-reset or --reject-with icmp-port-unreachable. It also happens regardless of whether the Kafka sources have {{setCommitOffsetsOnCheckpoints}} set to true or false.

The system is configured to use Presto for the S3 file system. The job has a small amount of state, so it is configured to use {{FsStateBackend}} with asynchronous snapshots.

If the iptables rules are removed, the job continues to function.

I would expect the job to either fail or continue running if a checkpoint fails.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
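
For anyone trying to reproduce the S3 block described in this report, the three iptables variants could be generated roughly as follows. This is only a sketch: the report names the ports (80 and 443), the targets (DROP and REJECT), and the two reject types, but not the exact rules, so the OUTPUT chain, the tcp protocol match, and the helper name {{build_rules}} are assumptions. The script only prints the commands and does not run them.

```python
# Hypothetical reproduction helper: builds (but does not execute) the iptables
# commands for each blocking variant mentioned in the report.
# Assumptions not stated in the report: the OUTPUT chain, "-p tcp", and the
# one-rule-per-port layout.

PORTS = (80, 443)

# Target variants named in the report.
VARIANTS = {
    "drop": ["-j", "DROP"],
    "reject-tcp-reset": ["-j", "REJECT", "--reject-with", "tcp-reset"],
    "reject-icmp": ["-j", "REJECT", "--reject-with", "icmp-port-unreachable"],
}


def build_rules(variant, action="-A"):
    """Return one iptables argv list per blocked port.

    action is "-A" to append the rule or "-D" to delete it again.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant}")
    if action not in ("-A", "-D"):
        raise ValueError(f"unsupported action: {action}")
    return [
        ["iptables", action, "OUTPUT", "-p", "tcp", "--dport", str(port)]
        + VARIANTS[variant]
        for port in PORTS
    ]


if __name__ == "__main__":
    # Print every variant so the commands can be reviewed before being run
    # as root on the TM host.
    for name in VARIANTS:
        print(f"# {name}")
        for argv in build_rules(name):
            print(" ".join(argv))
```

Calling {{build_rules(variant, "-D")}} yields the matching delete commands, which corresponds to the "rules are removed" step in the report.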