Thanks everyone for all the feedback.

If there is no further concerns or comments I will start a voting thread on
this KIP tomorrow.

Thanks,

Jiangjie (Becket) Qin

On Tue, Sep 6, 2016 at 9:48 AM, Becket Qin <becket....@gmail.com> wrote:

> Hi Magnus,
>
> Thanks for the comments. I agree that querying messages within a time
> range is a valid use case (actually this is an example use case in my
> previous email). The current proposal can achieve this by having two
> ListOffsetRequest, right? I think the current API already supports the use
> cases that require the offsets for multiple timestamps. The question is
> that whether it is worth adding more complexity to the protocol to make it
> easier for multiple timestamp query. Personally I think given that query
> multiple timestamps is likely an infrequent operation, there is no need to
> optimize for it and complicates the protocol.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Sep 5, 2016 at 11:21 PM, Magnus Edenhill <mag...@edenhill.se>
> wrote:
>
>> Good write-up Qin, the API looks promising.
>>
>> I have one comment:
>>
>> 2016-09-03 5:20 GMT+02:00 Becket Qin <becket....@gmail.com>:
>>
>> > The currently offsetsForTimes() API obviously does not support querying
>> > multiple timestamps for the same partition. It doesn't seems a feature
>> for
>> > ListOffsetRequest v0 either (sounds more like a bug). My intuition is
>> that
>> > it's a rare use case. Given it does not exist before and we don't see a
>> > strong need from the community either, maybe it is better to keep it
>> simple
>> > for ListOffsetRequest v1. We can add it later if it turns out to be a
>> > useful feature (that may need a interface change, but I honestly do not
>> > think people would frequently query many different timestamps for the
>> same
>> > partition)
>> >
>>
>> I argue that the current behaviour of OffsetRequest with regards to
>> duplicate partitions is a bug
>> and think it would be a mistake to move the same semantics over to thew
>> new
>> ListOffset API.
>> One use case is that an application may want to know the offset range
>> between two timestamps,
>> e.g., for reprocessing, batching, searching, etc.
>>
>>
>> Thanks,
>> Magnus
>>
>>
>>
>> >
>> > Have a good long weekend!
>> >
>> > Thanks,
>> >
>> > Jiangjie (Becket) Qin
>> >
>> >
>> >
>> >
>> > On Fri, Sep 2, 2016 at 6:10 PM, Ismael Juma <ism...@juma.me.uk> wrote:
>> >
>> > > Thanks for the proposal Becket. Looks good overall, a few comments:
>> > >
>> > > ListOffsetResponse => [TopicName [PartitionOffsets]]
>> > > >   PartitionOffsets => Partition ErrorCode Timestamp [Offset]
>> > > >   Partition => int32
>> > > >   ErrorCode => int16
>> > > >   Timestamp => int64
>> > > >   Offset => int
>> > >
>> > >
>> > > It should be int64 for `Offset` right?
>> > >
>> > > Implementation wise, we will migrate to o.a.k.common.requests.
>> > > ListOffsetRequest
>> > > > class on the broker side.
>> > >
>> > >
>> > > Could you clarify what you mean here? We already
>> > > use o.a.k.common.requests.ListOffsetRequest in KafkaApis.
>> > >
>> > > long offset = consumer.offsetForTime(Collections.singletonMap(
>> > > topicPartition,
>> > > > targetTime)).offset;
>> > >
>> > >
>> > > The result of `offsetForTime` is a Map, so we can't just call
>> `offset` on
>> > > it. You probably meant something like:
>> > >
>> > > long offset = consumer.offsetForTime(Collections.singletonMap(
>> > > topicPartition,
>> > > targetTime)).get(topicPartition).offset;
>> > >
>> > > Test searchByTimestamp with CreateTime and LogAppendTime
>> > > >
>> > >
>> > > Do you mean `Test offsetForTime`?
>> > >
>> > > And:
>> > >
>> > > 1. In KAFKA-1588, the following issue was described "When performing
>> an
>> > > OffsetRequest, if you request the same topic and partition combination
>> > in a
>> > > single request more than once (for example, if you want to get both
>> the
>> > > head and tail offsets for a partition in the same request), you will
>> get
>> > a
>> > > response for both, but they will be the same offset". Will the new
>> > request
>> > > version support the use case where multiple timestamps are passed for
>> the
>> > > same topic partition? And if we do support it at the protocol level,
>> do
>> > we
>> > > also want to support it at the API level or do we think the additional
>> > > complexity is not worth it?
>> > >
>> > > 2. Is `offsetForTime` the right method name given that we are getting
>> > > multiple offsets? Maybe it should be `offsetsForTimes` or something
>> like
>> > > that.
>> > >
>> > > Ismael
>> > >
>> > > On Wed, Aug 31, 2016 at 4:38 AM, Becket Qin <becket....@gmail.com>
>> > wrote:
>> > >
>> > > > Hi Kafka devs,
>> > > >
>> > > > I created KIP-79 to allow consumer to precisely query the offsets
>> based
>> > > on
>> > > > timestamp.
>> > > >
>> > > > In short we propose to :
>> > > > 1. add a ListOffsetRequest/ListOffsetResponse v1, and
>> > > > 2. add an offsetForTime() method in new consumer.
>> > > >
>> > > > The KIP wiki is the following:
>> > > > https://cwiki.apache.org/confluence/pages/viewpage.
>> > > action?pageId=65868090
>> > > >
>> > > > Comments are welcome.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Jiangjie (Becket) Qin
>> > > >
>> > >
>> >
>>
>
>

Reply via email to