I don't have a strong preference. So I am also fine to deprecate the
existing methods. Let's see what Jason thinks.

Can you update the KIP to reflect the semantics of the return `Map` (ie,
does only contain entries for partitions with committed offsets, and
does not contain `null` values)?


+1 (binding)

-Matthias




On 9/10/19 11:53 AM, Guozhang Wang wrote:
> Hi Jason / Matthias,
> 
> I understand your concerns now. Just to clarify my main motivation on
> deprecating the old APIs is not only for aesthetics (confess I personally
> do have preference on succinct APIs), but to urge people to use the batched
> API for better latency when possible --- as stated in the KIP, my
> observation is that in practice most callers are actually going to get
> committed offsets for more than one partitions, and without deprecating the
> old APIs it would be hard for them to realize that the new API does have a
> benefit in performance.
> 
> This is different from some of the existing APIs though -- e.g., Matthias
> mentioned about seek / seekToBeginning / seekToEnd, where only seekToXX
> have plurals and seek only have singulars. We can, of course, make seekToXX
> with plurals as well just like commitSync/Async, but since seeks are
> non-blocking APIs (they rely on the follow-up polling call to talk to the
> brokers) either calling it multiple times with one partition each v.s.
> calling it one time with a plural makes little difference (still, I'd argue
> that today we do not have a same-named function overloaded with both types
> :P) On the other hand, committed, commitSync, offsetsForTimes etc blocking
> calls are all in the form of plurals except
> 
> * committed
> * position
> * partitionsFor
> 
> My rationale was that 1) for consecutive calls of #position, mostly it
> would only require a single round-trip to brokers since we are trying to
> refresh fetching positions for all partitions anyways, and 2) for
> #partitionsFor, theoretically we could also consider to ask for multiple
> topics in one call since each blocking call potentially incurs one round
> trip, but I did not include it in the scope of this KIP only because I have
> not observed too many usage patterns that are commonly calling it
> consecutively for multiple topics. At the moment, what I truly want to
> "improve" on is the committed calls, as in many cases I've seen it being
> called consecutively for multiple topic-partitions.
> 
> Therefore, I'm still more inclined to deprecate the old APIs so that we can
> enforce people to discover the new batching APIs for efficiency in this
> KIP. But if you feel that this compatibility is very crucial to maintain I
> could be convinced.
> 
> 
> Guozhang
> 
> On Tue, Sep 10, 2019 at 10:18 AM Matthias J. Sax <matth...@confluent.io>
> wrote:
> 
>> Thanks for the KIP Guozhang.
>>
>>> Another reason is that other functions of KafkaConsumers do not have
>> those
>>> overloaded functions to be consistent
>>
>> I tend to agree with Jason about keeping the existing methods. Your
>> argument does not seem to hold. I just checked the `Consumer` API, and
>> it's mix of overloads:
>>
>> Methods only talking `Collections`
>>
>>
>> subscribe/assign/commitSync/commitAsyn/pause/resume/offsetsForTimes/beginningOffsets/endOffsets
>>
>> Method with overload taking `Collections` or as single value:
>>
>> seek/seekToBeginning/seekToEnd
>>
>> (those are strictly different methods, but they are semantically related)
>>
>> Only talking single value:
>>
>> position/committed/partitionsFor
>>
>>
>> While you are strictly speaking correct, that there is no method with an
>> overload for `Collection` and single value, the API mix seems to suggest
>> that it might actually be worth to have corresponding overloads for all
>> methods instead of sticking to `Collections` only.
>>
>>
>>
>> About the return map: I agree that not containing any entry in the map
>> is better. It's not only consistent with other APIs but it also avoids
>> potential NPEs.
>>
>>
>>
>> -Matthias
>>
>>
>> On 9/10/19 10:04 AM, Jason Gustafson wrote:
>>>>  I feel it not worth making committed to have both plurals and
>> singulars.
>>>
>>> Not sure I agree. If we had started with these new APIs from the
>> beginning,
>>> that may have been better, but we already have exposed the singular APIs
>>> and users are depending on them. Not sure it's worth breaking
>> compatibility
>>> just for aesthetics.
>>>
>>> -Jason
>>>
>>> On Tue, Sep 10, 2019 at 9:41 AM Guozhang Wang <wangg...@gmail.com>
>> wrote:
>>>
>>>> Thanks Jason!
>>>>
>>>> On Tue, Sep 10, 2019 at 9:07 AM Jason Gustafson <ja...@confluent.io>
>>>> wrote:
>>>>
>>>>> Hi Guozhang,
>>>>>
>>>>> I think the motivation for the new API makes sense. I've wanted
>> something
>>>>> like this in the past. That said, do you think there is a substantial
>>>>> benefit from deprecating the old API? I can still see it being
>> convenient
>>>>> in some cases and it's no real cost to maintain.
>>>>>
>>>>>
>>>> That's a good question.
>>>>
>>>> Personally I would like to keep a succinct set of APIs out of the box
>> and
>>>> let users who want more syntax sugars to add themselves as extended
>> classes
>>>> for example (KafkaConsumer is not a final class).
>>>> Another reason is that other functions of KafkaConsumers do not have
>> those
>>>> overloaded functions to be consistent, e.g. we do not have a
>>>> subscribe(single-topic), pause/resume(single-topic-partition) or
>>>> seekToBeginning(single-topic-partition). I feel it not worth making
>>>> committed to have both plurals and singulars.
>>>>
>>>> WDYT?
>>>>
>>>>
>>>>> Also, just a minor detail. If the partition has no committed offset,
>> will
>>>>> it be present in the map with a null value?
>>>>>
>>>>> I looked into the admin client's listConsumerGroupOffsets call when
>>>> creating the KIP, and to be consistent with that API my intention is to
>> NOT
>>>> include the entry if a topic-partition does not have committed offsets.
>>>> That said, if we feel returning an entry with null value is better for
>>>> programmability I can also do that (and will update wiki page to
>> clarify as
>>>> well). LMK.
>>>>
>>>>
>>>>> -Jason
>>>>>
>>>>> On Tue, Sep 10, 2019 at 6:09 AM Mickael Maison <
>> mickael.mai...@gmail.com
>>>>>
>>>>> wrote:
>>>>>
>>>>>> +1 (non-binding), thanks Guozhang
>>>>>>
>>>>>> On Tue, Sep 10, 2019 at 1:14 AM Boyang Chen <
>>>> reluctanthero...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hey Guozhang,
>>>>>>>
>>>>>>> LGTM, +1 (non-binding)
>>>>>>>
>>>>>>> On Mon, Sep 9, 2019 at 5:07 PM Guozhang Wang <wangg...@gmail.com>
>>>>> wrote:
>>>>>>>
>>>>>>>> Hello folks,
>>>>>>>>
>>>>>>>> I've created a new KIP allowing consumer.committed to take a set of
>>>>>>>> partitions instead of just one partition to allow batching effects
>>>> of
>>>>>> such
>>>>>>>> requests (the protocol already allows us to send multiple
>>>> partitions
>>>>>> in one
>>>>>>>> round-trip):
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-520%3A+Add+overloaded+Consumer%23committed+for+batching+partitions
>>>>>>>>
>>>>>>>> Since it is a pretty straight-forward KIP, I'm starting the VOTE
>>>> for
>>>>>> this
>>>>>>>> KIP directly. If there are any suggestions about this proposal,
>>>>> please
>>>>>> feel
>>>>>>>> free to share them in this thread. Thank you!
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Guozhang
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -- Guozhang
>>>>
>>>
>>
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to