I don't have a strong preference. So I am also fine to deprecate the existing methods. Let's see what Jason thinks.
Can you update the KIP to reflect the semantics of the return `Map` (ie, does only contain entries for partitions with committed offsets, and does not contain `null` values)? +1 (binding) -Matthias On 9/10/19 11:53 AM, Guozhang Wang wrote: > Hi Jason / Matthias, > > I understand your concerns now. Just to clarify my main motivation on > deprecating the old APIs is not only for aesthetics (confess I personally > do have preference on succinct APIs), but to urge people to use the batched > API for better latency when possible --- as stated in the KIP, my > observation is that in practice most callers are actually going to get > committed offsets for more than one partitions, and without deprecating the > old APIs it would be hard for them to realize that the new API does have a > benefit in performance. > > This is different from some of the existing APIs though -- e.g., Matthias > mentioned about seek / seekToBeginning / seekToEnd, where only seekToXX > have plurals and seek only have singulars. We can, of course, make seekToXX > with plurals as well just like commitSync/Async, but since seeks are > non-blocking APIs (they rely on the follow-up polling call to talk to the > brokers) either calling it multiple times with one partition each v.s. > calling it one time with a plural makes little difference (still, I'd argue > that today we do not have a same-named function overloaded with both types > :P) On the other hand, committed, commitSync, offsetsForTimes etc blocking > calls are all in the form of plurals except > > * committed > * position > * partitionsFor > > My rationale was that 1) for consecutive calls of #position, mostly it > would only require a single round-trip to brokers since we are trying to > refresh fetching positions for all partitions anyways, and 2) for > #partitionsFor, theoretically we could also consider to ask for multiple > topics in one call since each blocking call potentially incurs one round > trip, but I did not include it in the scope of this KIP only because I have > not observed too many usage patterns that are commonly calling it > consecutively for multiple topics. At the moment, what I truly want to > "improve" on is the committed calls, as in many cases I've seen it being > called consecutively for multiple topic-partitions. > > Therefore, I'm still more inclined to deprecate the old APIs so that we can > enforce people to discover the new batching APIs for efficiency in this > KIP. But if you feel that this compatibility is very crucial to maintain I > could be convinced. > > > Guozhang > > On Tue, Sep 10, 2019 at 10:18 AM Matthias J. Sax <matth...@confluent.io> > wrote: > >> Thanks for the KIP Guozhang. >> >>> Another reason is that other functions of KafkaConsumers do not have >> those >>> overloaded functions to be consistent >> >> I tend to agree with Jason about keeping the existing methods. Your >> argument does not seem to hold. I just checked the `Consumer` API, and >> it's mix of overloads: >> >> Methods only talking `Collections` >> >> >> subscribe/assign/commitSync/commitAsyn/pause/resume/offsetsForTimes/beginningOffsets/endOffsets >> >> Method with overload taking `Collections` or as single value: >> >> seek/seekToBeginning/seekToEnd >> >> (those are strictly different methods, but they are semantically related) >> >> Only talking single value: >> >> position/committed/partitionsFor >> >> >> While you are strictly speaking correct, that there is no method with an >> overload for `Collection` and single value, the API mix seems to suggest >> that it might actually be worth to have corresponding overloads for all >> methods instead of sticking to `Collections` only. >> >> >> >> About the return map: I agree that not containing any entry in the map >> is better. It's not only consistent with other APIs but it also avoids >> potential NPEs. >> >> >> >> -Matthias >> >> >> On 9/10/19 10:04 AM, Jason Gustafson wrote: >>>> I feel it not worth making committed to have both plurals and >> singulars. >>> >>> Not sure I agree. If we had started with these new APIs from the >> beginning, >>> that may have been better, but we already have exposed the singular APIs >>> and users are depending on them. Not sure it's worth breaking >> compatibility >>> just for aesthetics. >>> >>> -Jason >>> >>> On Tue, Sep 10, 2019 at 9:41 AM Guozhang Wang <wangg...@gmail.com> >> wrote: >>> >>>> Thanks Jason! >>>> >>>> On Tue, Sep 10, 2019 at 9:07 AM Jason Gustafson <ja...@confluent.io> >>>> wrote: >>>> >>>>> Hi Guozhang, >>>>> >>>>> I think the motivation for the new API makes sense. I've wanted >> something >>>>> like this in the past. That said, do you think there is a substantial >>>>> benefit from deprecating the old API? I can still see it being >> convenient >>>>> in some cases and it's no real cost to maintain. >>>>> >>>>> >>>> That's a good question. >>>> >>>> Personally I would like to keep a succinct set of APIs out of the box >> and >>>> let users who want more syntax sugars to add themselves as extended >> classes >>>> for example (KafkaConsumer is not a final class). >>>> Another reason is that other functions of KafkaConsumers do not have >> those >>>> overloaded functions to be consistent, e.g. we do not have a >>>> subscribe(single-topic), pause/resume(single-topic-partition) or >>>> seekToBeginning(single-topic-partition). I feel it not worth making >>>> committed to have both plurals and singulars. >>>> >>>> WDYT? >>>> >>>> >>>>> Also, just a minor detail. If the partition has no committed offset, >> will >>>>> it be present in the map with a null value? >>>>> >>>>> I looked into the admin client's listConsumerGroupOffsets call when >>>> creating the KIP, and to be consistent with that API my intention is to >> NOT >>>> include the entry if a topic-partition does not have committed offsets. >>>> That said, if we feel returning an entry with null value is better for >>>> programmability I can also do that (and will update wiki page to >> clarify as >>>> well). LMK. >>>> >>>> >>>>> -Jason >>>>> >>>>> On Tue, Sep 10, 2019 at 6:09 AM Mickael Maison < >> mickael.mai...@gmail.com >>>>> >>>>> wrote: >>>>> >>>>>> +1 (non-binding), thanks Guozhang >>>>>> >>>>>> On Tue, Sep 10, 2019 at 1:14 AM Boyang Chen < >>>> reluctanthero...@gmail.com> >>>>>> wrote: >>>>>>> >>>>>>> Hey Guozhang, >>>>>>> >>>>>>> LGTM, +1 (non-binding) >>>>>>> >>>>>>> On Mon, Sep 9, 2019 at 5:07 PM Guozhang Wang <wangg...@gmail.com> >>>>> wrote: >>>>>>> >>>>>>>> Hello folks, >>>>>>>> >>>>>>>> I've created a new KIP allowing consumer.committed to take a set of >>>>>>>> partitions instead of just one partition to allow batching effects >>>> of >>>>>> such >>>>>>>> requests (the protocol already allows us to send multiple >>>> partitions >>>>>> in one >>>>>>>> round-trip): >>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-520%3A+Add+overloaded+Consumer%23committed+for+batching+partitions >>>>>>>> >>>>>>>> Since it is a pretty straight-forward KIP, I'm starting the VOTE >>>> for >>>>>> this >>>>>>>> KIP directly. If there are any suggestions about this proposal, >>>>> please >>>>>> feel >>>>>>>> free to share them in this thread. Thank you! >>>>>>>> >>>>>>>> >>>>>>>> -- Guozhang >>>>>>>> >>>>>> >>>>> >>>> >>>> >>>> -- >>>> -- Guozhang >>>> >>> >> >> >
signature.asc
Description: OpenPGP digital signature