Hi Jack, I do understand the idea of extending the Partitioner interface so that people are now able to use headers in the partitioning decision, and I see that it’s filling in gap in the interface which was left when headers were originally added.
Experience with non-default partitioning schemes in the past makes me unlikely to use anything other than the default partitioning scheme. But I wouldn’t let that dissuade you. Thanks, Andrew > On 3 Aug 2023, at 13:43, Jack Tomy <jacktomy...@gmail.com> wrote: > > Hey Andrew, Sagar > > Please share your thoughts. Thanks. > > > > On Mon, Jul 31, 2023 at 5:58 PM Jack Tomy <jacktomy...@gmail.com> wrote: > >> Hey Andrew, Sagar >> >> Thanks. I'm travelling so sorry for being brief and getting back late. >> >> 1. For the first concern, that is moving in a direction of server side >> partitioner, the idea seems very much promising but I believe we still have >> a long way to go. Since the proposal/design for the same is still not >> available, it's hard for me to defend my proposal. >> 2. For the second concern: >> 2.1 Loss of order in messages, I believe the ordering of messages is >> never promised and the partitioner has no requirement to ensure the same. >> It is upto the user to implement/use a partitioner which ensures ordering >> based on keys. >> 2.2 Key deciding the partitioner, It is totally up to the user to decide >> the partition regardless of the key, we are also passing the value to the >> partitioner. Even the existing implementation receives the value which lets >> the user decide the partition based on value. >> 2.3 Sending to a specific partition, for this, I need to be aware of the >> total number of partitions, but if I can do that same in partitioner, the >> cluster param gives me all the information I want. >> >> I would also quote a line from KIP-82 where headers were added to the >> serializer : The payload is traditionally for the business object, and >> *headers >> are traditionally used for transport routing*, filtering etc. So I >> believe when a user wants to add some routing information (in this case >> which set of partitions to go for), headers seem to be the right place. >> >> >> >> On Sat, Jul 29, 2023 at 8:48 PM Sagar <sagarmeansoc...@gmail.com> wrote: >> >>> Hi Andrew, >>> >>> Thanks for your comments. >>> >>> 1) Yes that makes sense and that's what even would expect to see as well. >>> I >>> just wanted to highlight that we might still need a way to let client side >>> partitioning logic be present as well. Anyways, I am good on this point. >>> 2) The example provided does seem achievable by simply attaching the >>> partition number in the ProducerRecord. I guess if we can't find any >>> further examples which strengthen the case of this partitioner, it might >>> be >>> harder to justify adding it. >>> >>> >>> Thanks! >>> Sagar. >>> >>> On Fri, Jul 28, 2023 at 2:05 PM Andrew Schofield < >>> andrew_schofield_j...@outlook.com> wrote: >>> >>>> Hi Sagar, >>>> Thanks for your comments. >>>> >>>> 1) Server-side partitioning doesn’t necessarily mean that there’s only >>> one >>>> way to do it. It just means that the partitioning logic runs on the >>> broker >>>> and >>>> any configuration of partitioning applies to the broker’s partitioner. >>> If >>>> we ever >>>> see a KIP for this, that’s the kind of thing I would expect to see. >>>> >>>> 2) In the priority example in the KIP, there is a kind of contract >>> between >>>> the >>>> producers and consumers so that some records can be processed before >>>> others regardless of the order in which they were sent. The producer >>>> wants to apply special significance to a particular header to control >>> which >>>> partition is used. I would simply achieve this by setting the partition >>>> number >>>> in the ProducerRecord at the time of sending. >>>> >>>> I don’t think the KIP proposes adjusting the built-in partitioner or >>>> adding to AK >>>> a new one that uses headers in the partitioning decision. So, any >>>> configuration >>>> for a partitioner that does support headers would be up to the >>>> implementation >>>> of that specific partitioner. Partitioner implements Configurable. >>>> >>>> I’m just providing an alternative view and I’m not particularly opposed >>> to >>>> the KIP. >>>> I just don’t think it quite merits the work involved to get it voted and >>>> merged. >>>> As an aside, a long time ago, I created a small KIP that was never >>> adopted >>>> and I didn’t push it because I eventually didn’t need it. >>>> >>>> Thanks, >>>> Andrew >>>> >>>>> On 28 Jul 2023, at 05:15, Sagar <sagarmeansoc...@gmail.com> wrote: >>>>> >>>>> Hey Andrew, >>>>> >>>>> Thanks for the review. Since I had reviewed the KIP I thought I would >>>> also >>>>> respond. Of course Jack has the final say on this since he wrote the >>> KIP. >>>>> >>>>> 1) This is an interesting point and I hadn't considered it. The >>>>> comparison with KIP-848 is a valid one but even within that KIP, it >>>> allows >>>>> client side partitioning for power users like Streams. So while we >>> would >>>>> want to move away from client side partitioner as much as possible, we >>>>> still shouldn't do away completely with Client side partitioning and >>> end >>>> up >>>>> being in a state of inflexibility for different kinds of usecases. >>> This >>>> is >>>>> my opinion though and you have more context on Clients, so would like >>> to >>>>> know your thoughts on this. >>>>> >>>>> 2) Regarding this, I assumed that since the headers are already part >>> of >>>> the >>>>> consumer records they should have access to the headers and if there >>> is a >>>>> contract b/w the applications producing and the application consuming, >>>> that >>>>> decisioning should be transparent. Was my assumption incorrect? But as >>>> you >>>>> rightly pointed out header based partitioning with keys is going to >>> lead >>>> to >>>>> surprising results. Assuming there is merit in this proposal, do you >>>> think >>>>> we should ignore the keys in this case (similar to the effect of >>>>> setting *partitioner.ignore.keys >>>>> *config to false) and document it appropriately? >>>>> >>>>> Let me know what you think. >>>>> >>>>> Thanks! >>>>> Sagar. >>>>> >>>>> >>>>> On Thu, Jul 27, 2023 at 9:41 PM Andrew Schofield < >>>>> andrew_schofield_j...@outlook.com> wrote: >>>>> >>>>>> Hi Jack, >>>>>> Thanks for the KIP. I have a few concerns about the idea. >>>>>> >>>>>> 1) I think that while a client-side partitioner seems like a neat >>> idea >>>> and >>>>>> it’s an established part of Kafka, >>>>>> it’s one of the things which makes Kafka clients quite complicated. >>> Just >>>>>> as KIP-848 is moving from >>>>>> client-side assignors to server-side assignors, I wonder whether >>> really >>>> we >>>>>> should be looking to make >>>>>> partitioning a server-side capability too over time. So, I’m not >>>> convinced >>>>>> that making the Partitioner >>>>>> interface richer is moving in the right direction. >>>>>> >>>>>> 2) For records with a key, the partitioner usually calculates the >>>>>> partition from the key. This means >>>>>> that records with the same key end up on the same partition. Many >>>>>> applications expect this to give ordering. >>>>>> Log compaction expects this. There are situations in which records >>> have >>>> to >>>>>> be repartitioned, such as >>>>>> sometimes happens with Kafka Streams. I think that a header-based >>>>>> partitioner for records which have >>>>>> keys is going to be surprising and only going to have limited >>>>>> applicability as a result. >>>>>> >>>>>> The tricky part about clever partitioning is that downstream systems >>>> have >>>>>> no idea how the partition >>>>>> number was arrived at, so they do not truly understand how the >>> ordering >>>>>> was derived. I do think that >>>>>> perhaps there’s value to being able to influence the partitioning >>> using >>>>>> the headers, but I wonder if actually >>>>>> transforming the headers into an “ordering context” that then flows >>> with >>>>>> the record as it moves through >>>>>> the system would be a stronger solution. Today, the key is the >>> ordering >>>>>> context. Maybe it should be a >>>>>> concept in its own right and the Producer could configure a converter >>>> from >>>>>> headers to ordering context. >>>>>> That is of course a much bigger change. >>>>>> >>>>>> In one of the examples you mention in the KIP, you mentioned using a >>>>>> header to control priority. I guess the >>>>>> idea is to preferentially process records off specific partitions so >>>> they >>>>>> can overtake lower priority records. >>>>>> I suggest just sending the records explicitly to specific partitions >>> to >>>>>> achieve this. >>>>>> >>>>>> Sorry for the essay, but you did ask for people to share their >>> thoughts >>>> :) >>>>>> >>>>>> Just my opinion. Let’s see what others think. >>>>>> >>>>>> Thanks, >>>>>> Andrew >>>>>> >>>>>>> On 25 Jul 2023, at 14:58, Jack Tomy <jacktomy...@gmail.com> wrote: >>>>>>> >>>>>>> Hey @Sagar >>>>>>> >>>>>>> Thanks again for the review. >>>>>>> 1. "a null headers value is equivalent to invoking the older >>> partition >>>>>>> method", this is not true. If someone makes an implementation and >>> the >>>>>>> headers come as null, still the new implementation will take effect. >>>>>>> Instead I have added : "Not overriding this method in the >>> Partitioner >>>>>>> interface has the same behaviour as using the existing method." >>>>>>> 2. Corrected. >>>>>>> >>>>>>> Hey @Sagar and everyone, >>>>>>> Please have a look at the new version and share your thoughts. >>>>>>> >>>>>> >>>> >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937 >>>>>>> Like Sagar mentioned, I would also request more people who have more >>>>>>> context on clients to chime in. >>>>>>> >>>>>>> >>>>>>> On Tue, Jul 25, 2023 at 2:58 PM Sagar <sagarmeansoc...@gmail.com> >>>> wrote: >>>>>>> >>>>>>>> Hi Jack, >>>>>>>> >>>>>>>> Thanks I have a couple of final comments and then I am good. >>>>>>>> >>>>>>>> 1) Can you elaborate on the Javadocs of the partition headers >>> argument >>>>>> to >>>>>>>> specify that a null headers value is equivalent to invoking the >>> older >>>>>>>> partition method? It is apparent but generally good to call out. >>>>>>>> 2) In the Compatibility section, you have mentioned backward >>>>>> comparable. I >>>>>>>> believe it should be *backward compatible change.* >>>>>>>> >>>>>>>> I don't have other comments. Post this, probably someone else who >>> has >>>>>> more >>>>>>>> context on Clients can also chime in on this before we can move >>> this >>>> to >>>>>>>> Voting. >>>>>>>> >>>>>>>> Thanks! >>>>>>>> Sagar. >>>>>>>> >>>>>>>> >>>>>>>> On Sat, Jul 22, 2023 at 10:09 AM Jack Tomy <jacktomy...@gmail.com> >>>>>> wrote: >>>>>>>> >>>>>>>>> Hey @Sagar, >>>>>>>>> >>>>>>>>> Thank you again for the response and feedback. >>>>>>>>> >>>>>>>>> 1. Though the ask wasn't very clear to me I have attached the >>>> Javadoc >>>>>>>> as >>>>>>>>> per your suggestion. Please have a look and let me know if this >>>> meets >>>>>>>>> the >>>>>>>>> expectations. >>>>>>>>> 2. Done. >>>>>>>>> 3. Done >>>>>>>>> 4. Done >>>>>>>>> >>>>>>>>> Hey @Sagar and everyone, >>>>>>>>> Please have a look at the new version and share your thoughts. >>>>>>>>> >>>>>>>> >>>>>> >>>> >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937 >>>>>>>>> >>>>>>>>> On Thu, Jul 20, 2023 at 9:46 PM Sagar <sagarmeansoc...@gmail.com> >>>>>> wrote: >>>>>>>>> >>>>>>>>>> Thanks Jack for the updates. >>>>>>>>>> >>>>>>>>>> Some more feedback: >>>>>>>>>> >>>>>>>>>> 1) It would be better if you can add the Javadoc in the Public >>>>>>>> interfaces >>>>>>>>>> section. That is a general practice used which gives the readers >>> of >>>>>> the >>>>>>>>> KIP >>>>>>>>>> a high level idea of the Public Interfaces. >>>>>>>>>> >>>>>>>>>> 2) In the proposed section, the bit about marking headers as read >>>> only >>>>>>>>>> seems like an implementation detail This can generally be >>> avoided in >>>>>>>>> KIPs. >>>>>>>>>> >>>>>>>>>> 3) Also, in the Deprecation section, can you mention again that >>> this >>>>>>>> is a >>>>>>>>>> backward compatible change and the reason for it (already done in >>>> the >>>>>>>>>> Proposed Changes section). >>>>>>>>>> >>>>>>>>>> 4) In the Testing Plan section, there is still the KIP template >>> bit >>>>>>>>> copied >>>>>>>>>> over. That can be removed. >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> Sagar. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Jul 20, 2023 at 2:48 PM Jack Tomy <jacktomy...@gmail.com >>>> >>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hey Everyone, >>>>>>>>>>> >>>>>>>>>>> Please consider this as a reminder and share your feedback. >>> Thank >>>>>>>> you. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>> >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937 >>>>>>>>>>> >>>>>>>>>>> On Tue, Jul 18, 2023 at 5:43 PM Jack Tomy < >>> jacktomy...@gmail.com> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hey @Sagar, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for the response and feedback. >>>>>>>>>>>> >>>>>>>>>>>> 1. Done >>>>>>>>>>>> 2. Yeah, that was a mistake from my end. Corrected. >>>>>>>>>>>> 3. Can you please elaborate this, I have added the java doc >>>>>>>> along >>>>>>>>>> with >>>>>>>>>>>> the code changes. Should I paste the same in KIP too? >>>>>>>>>>>> 4. Moved. >>>>>>>>>>>> 5. I have added one more use case, it is actually helpful in >>> any >>>>>>>>>>>> situation where you want to pass some information to partition >>>>>>>>>> method >>>>>>>>>>> but >>>>>>>>>>>> don't have to have it in the key or value. >>>>>>>>>>>> 6. Added. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hey @Sagar and everyone, >>>>>>>>>>>> Please have a look at the new version and share your thoughts. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>> >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Jul 18, 2023 at 9:53 AM Sagar < >>> sagarmeansoc...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Jack, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the KIP! Seems like an interesting idea. I have >>> some >>>>>>>>>>> feedback: >>>>>>>>>>>>> >>>>>>>>>>>>> 1) It would be great if you could clean up the text that >>> seems to >>>>>>>>>> mimic >>>>>>>>>>>>> the >>>>>>>>>>>>> KIP template. It is generally not required in the KIP. >>>>>>>>>>>>> >>>>>>>>>>>>> 2) In the Public Interfaces where you mentioned *Partitioner >>>>>>>> method >>>>>>>>> in >>>>>>>>>>>>> **org/apache/kafka/clients/producer >>>>>>>>>>>>> will have the following update*, I believe you meant the >>>>>>>> Partitioner >>>>>>>>>>>>> *interface*? >>>>>>>>>>>>> >>>>>>>>>>>>> 3) Staying on Public Interface, it is generally preferable to >>> add >>>>>>>> a >>>>>>>>>>>>> Javadocs section along with the newly added method. You could >>>> also >>>>>>>>>>>>> describe >>>>>>>>>>>>> the behaviour of it invoking the default existing method. >>>>>>>>>>>>> >>>>>>>>>>>>> 4) The option that is mentioned in the Rejected Alternatives, >>>>>>>> seems >>>>>>>>>> more >>>>>>>>>>>>> like a workaround to the current problem that you are >>> describing. >>>>>>>>> That >>>>>>>>>>>>> could be added to the Motivation section IMO. >>>>>>>>>>>>> >>>>>>>>>>>>> 5) Can you also add some more examples of scenarios where this >>>>>>>> would >>>>>>>>>> be >>>>>>>>>>>>> helpful? The only scenario mentioned seems to have a >>> workaround. >>>>>>>>> Just >>>>>>>>>>>>> trying to ensure that we have a strong enough motivation >>> before >>>>>>>>>> adding a >>>>>>>>>>>>> public API. >>>>>>>>>>>>> >>>>>>>>>>>>> 6) One thing which should also be worth noting down would be >>> what >>>>>>>>>>> happens >>>>>>>>>>>>> if users override both methods, only one method (new or old) >>> and >>>>>>>> no >>>>>>>>>>>>> methods >>>>>>>>>>>>> (the default behaviour). It would help in understanding the >>>>>>>> proposal >>>>>>>>>>>>> better. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks! >>>>>>>>>>>>> Sagar. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jul 17, 2023 at 9:19 PM Jack Tomy < >>> jacktomy...@gmail.com >>>>> >>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hey everyone, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Not seeing much discussion on the KPI. Might be because it is >>>>>>>> too >>>>>>>>>>>>>> obvious 😉. >>>>>>>>>>>>>> >>>>>>>>>>>>>> If there are no more comments, I will start the VOTE in the >>>>>>>> coming >>>>>>>>>>> days. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Jul 15, 2023 at 8:48 PM Jack Tomy < >>>>>>>> jacktomy...@gmail.com> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hey everyone, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please take a look at the KPI below and provide your >>>>>>>> suggestions >>>>>>>>>> and >>>>>>>>>>>>>>> feedback. TIA. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>> >>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263424937 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Best Regards >>>>>>>>>>>>>>> *Jack* >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Best Regards >>>>>>>>>>>>>> *Jack* >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Best Regards >>>>>>>>>>>> *Jack* >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Best Regards >>>>>>>>>>> *Jack* >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Best Regards >>>>>>>>> *Jack* >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Regards >>>>>>> *Jack* >>>>>> >>>>>> >>>> >>>> >>> >> >> >> -- >> Best Regards >> *Jack* >> > > > -- > Best Regards > *Jack*