I added this question to the FAQ, since it comes up frequently: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIaccuratelygetoffsetsofmessagesforacertaintimestampusingOffsetFetchRequest ?
On Tue, Sep 2, 2014 at 1:48 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> The semantics of the offset API are to "return the latest possible offset of
> the message that is appended no later than the given timestamp". In the
> implementation, it gets the starting offset of the log segment that is
> created no later than the given timestamp, and hence if your log segment
> contains data for a long period of time, then the offset API may return you
> just the starting offset of the current log segment.
>
> If your traffic is small and you still want a finer-grained offset
> response, you can try to reduce the log segment size (defaults to 1 GB);
> however, doing so will increase the number of file handles, with more
> frequent log segment rolling.
>
> Guozhang
>
>
> On Tue, Sep 2, 2014 at 10:21 AM, Manjunath Shivakumar <
> manjunath.shivaku...@betfair.com> wrote:
>
> > Hi,
> >
> > My use case is to fetch the offsets for a given topic from X milliseconds
> > ago.
> > If I use the offset API
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetAPI
> >
> > to do this and pass in a timestamp of (now() - X), I get the earliest
> > offset in the current log segment and not the offset from X milliseconds
> > ago.
> >
> > Is this the correct usage or behaviour?
> >
> > Thanks,
> > Manju
>
> --
> -- Guozhang
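
For anyone who finds this thread later, here is a rough sketch of the segment-granularity lookup described in the quoted reply. It is a simplified model in Python, not the broker's actual code: the `Segment` type, the `created_ms` field, and the `offset_before` function are all hypothetical names made up for illustration. It only shows why the answer can never be finer than a segment boundary.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    """Hypothetical stand-in for one log segment (not a Kafka class)."""
    base_offset: int  # offset of the first message in the segment
    created_ms: int   # assumed creation time of the segment, in ms


def offset_before(segments: List[Segment], timestamp_ms: int) -> Optional[int]:
    """Return the base offset of the newest segment created no later than
    timestamp_ms, or None if every segment is newer than the timestamp.

    Assumes `segments` is sorted by creation time, oldest first.
    """
    created = [s.created_ms for s in segments]
    # Number of segments whose creation time is <= timestamp_ms.
    idx = bisect_right(created, timestamp_ms)
    if idx == 0:
        return None
    return segments[idx - 1].base_offset


# Three segments; the last one is the "current" segment.
segments = [Segment(0, 1_000), Segment(500, 5_000), Segment(1200, 9_000)]

# Any timestamp at or after 9_000 maps to the start of the current segment,
# no matter how many messages were appended to it since then.
print(offset_before(segments, 9_500))
print(offset_before(segments, 60_000))
```

In this model, both lookups return the same base offset (1200), which matches the behaviour Manju reported. Smaller segments give the lookup more boundaries to choose from, which is the trade-off behind lowering the broker's `log.segment.bytes` setting.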