Hi Steffen,

Turns out that FLINK-4514 just missed Flink 1.1.2 and wasn’t included in the release (I’ll update the resolve version in JIRA to 1.1.3, thanks for noticing this!). The Flink community is going to release 1.1.3 asap, which will include the fix.

If you don’t want to wait for the release and want to try the fix now, you can also build on the current “release-1.1” branch, which already has FLINK-4514 merged.

Sorry for the inconvenience. Let me know if you bump into any other problems afterwards.

Best Regards,
Gordon

On October 5, 2016 at 2:56:21 AM, Steffen Hausmann (stef...@hausmann-family.de) wrote:

Hi there,

I'm running a Flink 1.1.2 job on EMR and Yarn that is reading events from a Kinesis stream. However, after a while (the exact duration varies and is on the order of minutes) the Kinesis source doesn't emit any further events and hence Flink doesn't produce any further output. Eventually, an ExpiredIteratorException occurs in one of the tasks, causing the entire job to fail:

> com.amazonaws.services.kinesis.model.ExpiredIteratorException: Iterator
> expired. The iterator was created at time Mon Oct 03 18:40:30 UTC 2016 while
> right now it is Mon Oct 03 18:45:33 UTC 2016 which is further in the future
> than the tolerated delay of 300000 milliseconds. (Service: AmazonKinesis;
> Status Code: 400; Error Code: ExpiredIteratorException; Request ID:
> dace9532-9031-54bc-8aa2-3cbfb136d590)

This seems to be related to FLINK-4514, which is marked as resolved for Flink 1.1.2. In contrast to what is described in the ticket, the job I'm running isn't suspended but hangs just a few minutes after the job has been started.

I've attached a log file showing the described behavior.

Any idea what may be wrong?

Thanks,
Steffen
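
For reference, the two timestamps in the exception message can be checked by hand: the iterator was 5 minutes and 3 seconds old when it was used, just past the 300000 ms limit the message mentions. The snippet below is only a sketch of that arithmetic. The helper name `iterator_age_ms` is made up for illustration and is not part of Flink or the AWS SDK; the timestamps and the limit are copied from the error text.

```python
from datetime import datetime, timezone

# Tolerated shard iterator age, taken from the exception message (5 minutes).
TOLERATED_DELAY_MS = 300_000


def iterator_age_ms(created_at, now):
    """Hypothetical helper: age of a shard iterator in milliseconds."""
    return int((now - created_at).total_seconds() * 1000)


# Timestamps copied from the ExpiredIteratorException text.
created_at = datetime(2016, 10, 3, 18, 40, 30, tzinfo=timezone.utc)
now = datetime(2016, 10, 3, 18, 45, 33, tzinfo=timezone.utc)

age_ms = iterator_age_ms(created_at, now)
expired = age_ms > TOLERATED_DELAY_MS

print(age_ms, expired)  # 303000 True
```

So the iterator was only about 3 seconds over the limit, which would fit a consumer that held on to an iterator for a little more than five minutes before calling Kinesis again. That reading is an interpretation of the error text, not something confirmed by the attached log.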