Thank you Vino. It is very helpful. ________________________________ From: vino yang <yanghua1...@gmail.com> Sent: Tuesday, August 7, 2018 7:22:50 PM To: Yan Zhou [FDS Science] Cc: user Subject: Re: checkpoint recovery behavior when kafka source is set to start from timestamp
Hi Yan Zhou: I think the java doc of the setStartFromTimestamp method has been explained very clearly, posted here: /** * Specify the consumer to start reading partitions from a specified timestamp. * The specified timestamp must be before the current timestamp. * This lets the consumer ignore any committed group offsets in Zookeeper / Kafka brokers. * * <p>The consumer will look up the earliest offset whose timestamp is greater than or equal * to the specific timestamp from Kafka. If there's no such offset, the consumer will use the * latest offset to read data from kafka. * * <p>This method does not affect where partitions are read from when the consumer is restored * from a checkpoint or savepoint. When the consumer is restored from a checkpoint or * savepoint, only the offsets in the restored state will be used. * * @param startupOffsetsTimestamp timestamp for the startup offsets, as milliseconds from epoch. * * @return The consumer object, to allow function chaining. */ Thanks, vino. Yan Zhou [FDS Science] <yz...@coupang.com<mailto:yz...@coupang.com>> 于2018年8月8日周三 上午9:06写道: Hi Experts, In my application, the kafka source is set to start from a specified timestamp, by calling method FlinkKafkaConsumer010#setStartFromTimestamp(long startupOffsetsTimestamp). If the application have run a while and then recover from a checkpoint because of failure, what's the offset will the kafka source to read from? I suppose it will read from the offset that has been committed before the failure. Is it right? I am going to verify it, however some clarification is good in case my test result doesn't meet my assumption. Best Yan