Hello, I'm wondering how, in the event of a poison pill record on Kafka, to advance a partition's checkpointed offsets by 1 when using the TableAPI/SQL.
It is my understanding that when checkpointing is enabled Flink uses its own checkpoint committed offsets and not the offsets committed to Kafka when starting a job from a checkpoint. In the event that there is a poison pill record in Kafka that is crashing the Flink job, we may want to simply advance our checkpointed offsets by 1 for the partition, past the poison record, and then continue operation as normal. We do not want to lose any other state in Flink however. I'm wondering how to go about this then. It's easy enough to have Kafka advance its committed offsets. Is there a way to tell Flink to ignore checkpointed offsets and instead respect the offsets committed to Kafka for a consumer group when restoring from a checkpoint? If so we could: 1. Advance Kafka's offsets. 2. Run our job from the checkpoint and have it use Kafka's offsets and then checkpoint with new Kafka offsets. 3. Stop the job, and rerun it using Flink's committed, now advanced, offsets. Is this possible? Are there any better strategies? Thanks! -- Rex Fenley | Software Engineer - Mobile and Backend Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/> | FOLLOW US <https://twitter.com/remindhq> | LIKE US <https://www.facebook.com/remindhq>