Thank you Vino. It is very helpful.

________________________________
From: vino yang <yanghua1...@gmail.com>
Sent: Tuesday, August 7, 2018 7:22:50 PM
To: Yan Zhou [FDS Science]
Cc: user
Subject: Re: checkpoint recovery behavior when kafka source is set to start 
from timestamp

Hi Yan Zhou:

I think the java doc of the setStartFromTimestamp method has been explained 
very clearly, posted here:

/**
* Specify the consumer to start reading partitions from a specified timestamp.
* The specified timestamp must be before the current timestamp.
* This lets the consumer ignore any committed group offsets in Zookeeper / 
Kafka brokers.
*
* <p>The consumer will look up the earliest offset whose timestamp is greater 
than or equal
* to the specific timestamp from Kafka. If there's no such offset, the consumer 
will use the
* latest offset to read data from kafka.
*
* <p>This method does not affect where partitions are read from when the 
consumer is restored
* from a checkpoint or savepoint. When the consumer is restored from a 
checkpoint or
* savepoint, only the offsets in the restored state will be used.
*
* @param startupOffsetsTimestamp timestamp for the startup offsets, as 
milliseconds from epoch.
*
* @return The consumer object, to allow function chaining.
*/

Thanks, vino.

Yan Zhou [FDS Science] <yz...@coupang.com<mailto:yz...@coupang.com>> 
于2018年8月8日周三 上午9:06写道:

Hi Experts,


In my application, the kafka source is set to start from a specified timestamp, 
by calling method FlinkKafkaConsumer010#setStartFromTimestamp(long 
startupOffsetsTimestamp).


If the application have run a while and then recover from a checkpoint because 
of failure, what's the offset will the kafka source to read from? I suppose it 
will read from the offset that has been committed before the failure. Is it 
right?


I am going to verify it, however some clarification is good in case my test 
result doesn't meet my assumption.


Best

Yan

Reply via email to