Hi Stephan,

Once the job restarts due to an async io operator timeout we notice that its checkpoints never succeed again. But the job is running fine and is processing data.

~
Karthik

On Mon, Oct 9, 2017 at 3:19 PM, Stephan Ewen <se...@apache.org> wrote:

> As long as this does not appear all the time, but only once in a while, it
> should not be a problem.
> It simply means that this particular checkpoint could not be triggered,
> because some sources were not ready yet.
>
> It should try another checkpoint and then be okay.
>
>
> On Fri, Oct 6, 2017 at 4:53 PM, Karthik Deivasigamani <karthi...@gmail.com>
> wrote:
>
>> We are using Flink 1.3.1 in Standalone mode with an HA job manager setup.
>> ~
>> Karthik
>>
>> On Fri, Oct 6, 2017 at 8:22 PM, Karthik Deivasigamani <
>> karthi...@gmail.com> wrote:
>>
>>> Hi,
>>> I'm noticing a weird issue with our Flink streaming job. We use the
>>> async I/O operator, which makes an HTTP call, and in certain cases when
>>> the async task times out, it throws an exception, causing the job to
>>> restart.
>>>
>>> java.lang.Exception: An async function call terminated with an exception. Failing the AsyncWaitOperator.
>>>     at org.apache.flink.streaming.api.operators.async.Emitter.output(Emitter.java:136)
>>>     at org.apache.flink.streaming.api.operators.async.Emitter.run(Emitter.java:83)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Async function call has timed out.
>>>     at org.apache.flink.runtime.concurrent.impl.FlinkFuture.get(FlinkFuture.java:110)
>>>
>>>
>>> After the job restarts (we have a fixed restart strategy) we notice that
>>> the checkpoints start failing continuously with this message:
>>> Checkpoint was declined (tasks not ready)
>>>
>>> [image: Inline image 1]
>>>
>>> But we see the job is running, it's processing data, the accumulators we
>>> have are getting incremented, etc., but checkpointing fails with the
>>> "tasks not ready" message.
>>>
>>> Wanted to reach out to the community to see if anyone else has
>>> experienced this issue before.
>>> ~
>>> Karthik
>>>
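
For anyone who hits the same restart trigger: the stack trace in the original message shows the operator failing because the async call outlived the operator's timeout. One possible way to avoid that particular failure is to enforce a shorter deadline on the client side and emit a fallback value instead of letting the operator's timeout fire. The following is only a minimal sketch in plain JDK Java, not Flink code. The class name `TimeoutFallback`, the method `withFallback`, and the fallback value are hypothetical, and it assumes that substituting a fallback result for a timed-out or failed HTTP call is acceptable for the job. It does not address the declined checkpoints after the restart.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TimeoutFallback {

    // Single daemon thread used only to fire client-side deadlines.
    private static final ScheduledExecutorService TIMER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "client-timeout");
            t.setDaemon(true);
            return t;
        });

    /**
     * Returns a future that completes with the call's result, or with
     * {@code fallback} if the call fails or is still pending after
     * {@code timeoutMillis}. Whichever happens first wins.
     */
    public static <T> CompletableFuture<T> withFallback(
            CompletableFuture<T> call, long timeoutMillis, T fallback) {
        CompletableFuture<T> guarded = new CompletableFuture<>();
        // Normal completion or failure of the underlying call.
        call.whenComplete((value, error) ->
            guarded.complete(error == null ? value : fallback));
        // Client-side deadline; a no-op if the call already finished.
        TIMER.schedule(() -> { guarded.complete(fallback); },
            timeoutMillis, TimeUnit.MILLISECONDS);
        return guarded;
    }

    public static void main(String[] args) throws Exception {
        // A call that has already returned, and one that never returns.
        CompletableFuture<String> fast = CompletableFuture.completedFuture("200 OK");
        CompletableFuture<String> slow = new CompletableFuture<>();

        System.out.println(withFallback(fast, 100, "TIMEOUT").get()); // prints "200 OK"
        System.out.println(withFallback(slow, 100, "TIMEOUT").get()); // prints "TIMEOUT"
    }
}
```

In an async function, the guarded future's result would be handed to the operator's collector, with the client-side deadline chosen to be shorter than the timeout configured on the async operator. Whether a fallback record or a side channel for failed requests is the better choice depends on the job.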