Hi,

We have a Flink stream job that uses the Flink Kafka consumer. Normally it commits consumer offsets to Kafka.
However, this job ended up in a state where it otherwise works just fine but no longer commits offsets to Kafka. The job keeps writing correct aggregation results to the sink, though. At the time of writing, the job has been running for 14 hours without committing offsets.

Below is an extract from taskmanager.log. As you can see, nothing was logged between 2018-06-06 10:01:33 and ~2018-06-07 22:08. That is also where the log ends; these are the last lines so far.

Could you help check whether this is a known bug, possibly already fixed, or something new?

I'm using a self-built Flink package 1.5-SNAPSHOT, flink commit 8395508b0401353ed07375e22882e7581d46ac0e, which is not super old.

Cheers,
Juho

2018-06-06 10:01:33,498 INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 0.10.2.1
2018-06-06 10:01:33,498 INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : e89bffd6b2eff799
2018-06-06 10:01:33,560 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Discovered coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) for group aggregate-all_server_measurements_combined-20180606-1000.
2018-06-06 10:01:33,563 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Discovered coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) for group aggregate-all_server_measurements_combined-20180606-1000.
2018-06-07 22:08:28,773 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Marking the coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) dead for group aggregate-all_server_measurements_combined-20180606-1000
2018-06-07 22:08:28,776 WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Auto-commit of offsets {topic1-2=OffsetAndMetadata{offset=12300395550, metadata=''}, topic1-18=OffsetAndMetadata{offset=12299210444, metadata=''}, topic3-0=OffsetAndMetadata{offset=5064277287, metadata=''}, topic4-6=OffsetAndMetadata{offset=5492398559, metadata=''}, topic2-1=OffsetAndMetadata{offset=89817267, metadata=''}, topic1-10=OffsetAndMetadata{offset=12299742352, metadata=''}} failed for group aggregate-all_server_measurements_combined-20180606-1000: Offset commit failed with a retriable exception. You should retry committing offsets.
2018-06-07 22:08:29,840 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Marking the coordinator my-kafka-host-10-1-16-97.cloud-internal.mycompany.com:9092 (id: 2147483550 rack: null) dead for group aggregate-all_server_measurements_combined-20180606-1000
2018-06-07 22:08:29,841 WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Auto-commit of offsets {topic1-6=OffsetAndMetadata{offset=12298347875, metadata=''}, topic4-2=OffsetAndMetadata{offset=5492779112, metadata=''}, topic1-14=OffsetAndMetadata{offset=12299972108, metadata=''}} failed for group aggregate-all_server_measurements_combined-20180606-1000: Offset commit failed with a retriable exception. You should retry committing offsets.