Hello folks,

This thread has been pending for a long time, so I'd like to bump it and get more feedback. Any questions are welcome.

Best,
TaiJuWu

On Sat, Nov 23, 2024 at 9:15 PM TaiJu Wu <tjwu1...@gmail.com> wrote:

> Hi Chia-Ping,
>
> Sorry for the late reply, and thanks for your feedback, which makes this KIP more valuable.
> After initial verification, I think this can be done without large changes.
>
> I have updated the KIP, thanks a lot.
>
> Best,
> TaiJuWu
>
>
> On Wed, Nov 20, 2024 at 5:06 PM Chia-Ping Tsai <chia7...@apache.org> wrote:
>
>> hi TaiJuWu
>>
>> Is there a possibility to extend this KIP to include topic-level compression for the producer? This is another issue that prevents us from sharing producers across different threads, as it's common to use different compression types for different topics (data).
>>
>> Best,
>> Chia-Ping
>>
>> On 2024/11/18 08:36:25 TaiJu Wu wrote:
>> > Hi Chia-Ping,
>> >
>> > Thanks for your suggestions and feedback.
>> >
>> > Q1: I have updated this according to your suggestions.
>> > Q2: This is a necessary change, since *RecordAccumulator* assumes that all records have the same acks (e.g. ProducerConfig.acks), so we need a method to distinguish which acks belong to each batch.
>> >
>> > Best,
>> > TaiJuWu
>> >
>> > On Mon, Nov 18, 2024 at 2:17 AM Chia-Ping Tsai <chia7...@apache.org> wrote:
>> >
>> > > hi TaiJuWu
>> > >
>> > > Q0:
>> > >
>> > > `Format: topic.acks` the dot is an acceptable character in topic naming, so maybe we should reverse the format to "acks.${topic}" to get the acks of a topic easily
>> > >
>> > > Q1: `Return Map<Acks, List<ProducerBatch>> when RecordAccumulator#drainBatchesForOneNode is called.`
>> > >
>> > > this is weird to me, as all we need to do is pass `Map<String, Acks>` to `Sender` and make sure `Sender#sendProduceRequest` adds the correct acks to ProduceRequest, right?
>> > >
>> > > Best,
>> > > Chia-Ping
>> > >
>> > >
>> > >
>> > > On 2024/11/15 05:12:33 TaiJu Wu wrote:
>> > > > Hi all,
>> > > >
>> > > > I have updated the contents of this KIP.
>> > > > Please take a look and let me know what you think.
>> > > >
>> > > > Thanks,
>> > > > TaiJuWu
>> > > >
>> > > > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > Thanks for your feedback and @Chia-Ping's help.
>> > > > >
>> > > > > I also agree that a topic-level acks config is more reasonable, and it can simplify the story.
>> > > > > When I tried implementing record-level acks, I noticed I don't have a good idea for avoiding iterating over batches to get partition information (needed by *RecordAccumulator#partitionChanged*).
>> > > > >
>> > > > > Back to the initial question of how to handle different acks for batches:
>> > > > > First, we can attach *topic-level acks* to *RecordAccumulator#TopicInfo*.
>> > > > > Second, we can return *Map<Acks, List<ProducerBatch>>* when *RecordAccumulator#drainBatchesForOneNode* is called. In this step, we can propagate acks to *sender*.
>> > > > > Finally, we can get the acks info and group batches with the same acks into a *List<ProducerBatch>* for a node in *sender#sendProduceRequests*.
>> > > > >
>> > > > > If I missed something or there is any mistake, please let me know.
>> > > > > I will update this KIP later. Thanks for your feedback.
>> > > > >
>> > > > > Best,
>> > > > > TaiJuWu
>> > > > >
>> > > > >
>> > > > > On Thu, Nov 14, 2024 at 9:46 AM Chia-Ping Tsai <chia7...@apache.org> wrote:
>> > > > >
>> > > > >> hi All
>> > > > >>
>> > > > >> This KIP is based on our use case where an edge application with many sensors wants to use a single producer to deliver ‘few but varied’ records with different acks settings.
>> > > > >> The reason for using a single producer is to minimize resource usage on edge devices with limited hardware capabilities. Currently, we use a producer pool to handle different acks values, which requires 3x producer instances. Additionally, this approach creates many idle producers if a sensor with a specific acks setting has no data for a while.
>> > > > >>
>> > > > >> I love David’s suggestion since the acks configuration is closely related to the topic. Maybe we can introduce an optional configuration in the producer to define topic-level acks, with the existing acks being the default for all topics. This approach is not only simple but also easy to understand and implement.
>> > > > >>
>> > > > >> Best,
>> > > > >> Chia-Ping
>> > > > >>
>> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
>> > > > >> > Hi TaiJuWu,
>> > > > >> > I've been thinking for a while about this KIP before jumping into the discussion.
>> > > > >> >
>> > > > >> > I'm afraid that I don't think the approach in the KIP is the best, given the design of the Kafka protocol in this area. Essentially, each Produce request contains the acks value at the top level, and may contain records for many topics or partitions. My point is that batching occurs at the level of a Produce request, so changing the acks value between records will require a new Produce request to be sent. There would likely be an efficiency penalty if this feature was used heavily with the acks changing record by record.
>> > > > >> >
>> > > > >> > I can see that potentially an application might want different ack levels for different topics, but I would be surprised if they use different ack levels within the same topic. Maybe David's suggestion of defining the acks per topic would be enough. What do you think?
>> > > > >> >
>> > > > >> > Thanks,
>> > > > >> > Andrew
>> > > > >> > ________________________________________
>> > > > >> > From: David Jacot <dja...@confluent.io.INVALID>
>> > > > >> > Sent: 13 November 2024 15:31
>> > > > >> > To: dev@kafka.apache.org <dev@kafka.apache.org>
>> > > > >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
>> > > > >> >
>> > > > >> > Hi TaiJuWu,
>> > > > >> >
>> > > > >> > Thanks for the KIP.
>> > > > >> >
>> > > > >> > The motivation is not clear to me. Could you please elaborate a bit more on it?
>> > > > >> >
>> > > > >> > My concern is that it adds a lot of complexity and the added value seems to be low. Moreover, it will make reasoning about an application from the server side more difficult because we can no longer assume that it writes with the ack based on the config. Another issue is about the batching, how do you plan to handle batches mixing records with different acks?
>> > > > >> >
>> > > > >> > An alternative approach may be to define the ack per topic. We could even think about defining it on the server side as a topic config. I haven't really thought about it but it may be something to explore a bit more.
>> > > > >> >
>> > > > >> > Best,
>> > > > >> > David
>> > > > >> >
>> > > > >> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau <froul...@confluent.io.invalid> wrote:
>> > > > >> >
>> > > > >> > > Hi TaiJuWu,
>> > > > >> > >
>> > > > >> > > I find this adds lots of complexity, and I am still not convinced of the added value. IMO creating a producer instance per ack level is not problematic and the behavior is clear for developers. What would be the added value of the proposed change?
>> > > > >> > >
>> > > > >> > > Regards,
>> > > > >> > >
>> > > > >> > >
>> > > > >> > > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
>> > > > >> > >
>> > > > >> > > > Hi Fred and Greg,
>> > > > >> > > >
>> > > > >> > > > Thanks for your feedback. It really is not straightforward, but it is interesting!
>> > > > >> > > > Here is the behavior I expect.
>> > > > >> > > >
>> > > > >> > > > The current producer uses the *RecordAccumulator* to gather records, and the sender thread sends them in batches. We can track each record’s acknowledgment setting as it is appended to the *RecordAccumulator*, allowing the *sender* to group batches by acknowledgment level and topicPartition when processing.
>> > > > >> > > >
>> > > > >> > > > Regarding the statement, "Callbacks for records being sent to the same partition are guaranteed to execute in order," this is ensured when *max.in.flight.requests.per.connection* is set to 1. We can send records with different acknowledgment levels in the order of acks=0, acks=1, acks=-1.
>> > > > >> > > > Since we need to send batches with different acknowledgment levels to the broker, the callback will execute after each request is completed.
>> > > > >> > > >
>> > > > >> > > > In response to, "If so, are low-acks records subject to head-of-line blocking from high-acks records?," I believe an additional configuration is necessary to control this behavior. We could allow records to be either sync or async, though the callback would still execute after each batch with varying acknowledgment levels completes. To measure behavior across acknowledgment levels, we could also include acks in *ProducerInterceptor*.
>> > > > >> > > >
>> > > > >> > > > Furthermore, before this KIP, a producer could only include one acks level, so ordering was assured. However, with this change, we can *ONLY* guarantee the sequence within records of the same acknowledgment level because we may send up to three separate requests to brokers.
>> > > > >> > > > Best,
>> > > > >> > > > TaiJuWu
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > > On Wed, Nov 6, 2024 at 10:01 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
>> > > > >> > > >
>> > > > >> > > > > Hi Fred and Greg,
>> > > > >> > > > >
>> > > > >> > > > > Apologies for the delayed response.
>> > > > >> > > > > Yes, you’re correct.
>> > > > >> > > > > I’ll outline the behavior I expect.
>> > > > >> > > > >
>> > > > >> > > > > Thanks for your feedback!
>> > > > >> > > > >
>> > > > >> > > > > Best,
>> > > > >> > > > > TaiJuWu
>> > > > >> > > > >
>> > > > >> > > > >
>> > > > >> > > > > On Wed, Nov 6, 2024 at 9:48 AM Greg Harris <greg.har...@aiven.io.invalid> wrote:
>> > > > >> > > > >
>> > > > >> > > > >> Hi TaiJuWu,
>> > > > >> > > > >>
>> > > > >> > > > >> Thanks for the KIP!
>> > > > >> > > > >>
>> > > > >> > > > >> Can you explain in the KIP about the behavior when the number of acks is different for individual records? I think the current description using the word "straightforward" does little to explain that, and may actually be hiding some complexity.
>> > > > >> > > > >>
>> > > > >> > > > >> For example, the send() javadoc contains this: "Callbacks for records being sent to the same partition are guaranteed to execute in order." Is this still true when acks vary for records within the same partition?
>> > > > >> > > > >> If so, are low-acks records subject to head-of-line-blocking from high-acks records? It seems to me that this feature is useful when acks is specified per-topic, but introduces a lot of edge cases that are underspecified.
>> > > > >> > > > >>
>> > > > >> > > > >> Thanks,
>> > > > >> > > > >> Greg
>> > > > >> > > > >>
>> > > > >> > > > >>
>> > > > >> > > > >> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
>> > > > >> > > > >>
>> > > > >> > > > >> > Hi Chia-Ping,
>> > > > >> > > > >> >
>> > > > >> > > > >> > Thanks for your feedback.
>> > > > >> > > > >> > I have updated the KIP based on your suggestions.
>> > > > >> > > > >> >
>> > > > >> > > > >> > Best,
>> > > > >> > > > >> > Stanley
>> > > > >> > > > >> >
>> > > > >> > > > >> > On Tue, Nov 5, 2024 at 4:41 PM Chia-Ping Tsai <chia7...@apache.org> wrote:
>> > > > >> > > > >> >
>> > > > >> > > > >> > > hi TaiJuWu,
>> > > > >> > > > >> > >
>> > > > >> > > > >> > > Q0: Could you please add the getter (Short acks()) to the "public interface" section?
>> > > > >> > > > >> > >
>> > > > >> > > > >> > > Q1: Could you please add an RPC json reference to prove "been available at the RPC-level,"
>> > > > >> > > > >> > >
>> > > > >> > > > >> > > Q2: Could you please add a link to the producer docs to prove "share a single producer instance across multiple threads"
>> > > > >> > > > >> > >
>> > > > >> > > > >> > > Thanks,
>> > > > >> > > > >> > > Chia-Ping
>> > > > >> > > > >> > >
>> > > > >> > > > >> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
>> > > > >> > > > >> > > > Hi all,
>> > > > >> > > > >> > > >
>> > > > >> > > > >> > > > I opened KIP-1107: Adding record-level acks for producers
>> > > > >> > > > >> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers>
>> > > > >> > > > >> > > > to reduce the limitation associated with reusing KafkaProducer.
>> > > > >> > > > >> > > >
>> > > > >> > > > >> > > > Feedback and suggestions are welcome.
>> > > > >> > > > >> > > >
>> > > > >> > > > >> > > > Thanks,
>> > > > >> > > > >> > > > TaiJuWu
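
For readers skimming the thread, the topic-level direction discussed in the replies (a per-topic override keyed as "acks.${topic}", with the existing acks as the default, and batches grouped by acks because a Produce request carries a single acks value) might be sketched roughly as below. This is only an illustration under assumptions, not the KIP's implementation: the class name `TopicAcksSketch` and the helpers `acksFor` and `groupByAcks` are hypothetical, plain strings stand in for the real config and batch types, and the final fallback to "all" assumes the producer default of recent Kafka versions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TopicAcksSketch {
    // Hypothetical key prefix, following the "acks.${topic}" format suggested in the thread.
    static final String TOPIC_ACKS_PREFIX = "acks.";

    /** Returns the per-topic override if one is configured, otherwise the producer-wide "acks". */
    static String acksFor(Map<String, String> config, String topic) {
        // Looking up the full key works even when the topic name itself contains dots.
        String override = config.get(TOPIC_ACKS_PREFIX + topic);
        if (override != null) {
            return override;
        }
        // Assumption: fall back to "all" when no producer-wide acks is set.
        return config.getOrDefault("acks", "all");
    }

    /** Groups topics by resolved acks; each group would need its own Produce request. */
    static Map<String, List<String>> groupByAcks(Map<String, String> config, List<String> topics) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String topic : topics) {
            groups.computeIfAbsent(acksFor(config, topic), k -> new ArrayList<>()).add(topic);
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("acks", "1");             // default for all topics
        config.put("acks.sensor.temp", "0"); // override for a topic whose name contains a dot
        config.put("acks.audit", "all");     // override for a topic needing full acknowledgment

        List<String> topics = List.of("sensor.temp", "audit", "metrics", "sensor.humidity");
        // prints {0=[sensor.temp], 1=[metrics, sensor.humidity], all=[audit]}
        System.out.println(groupByAcks(config, topics));
    }
}
```

With three distinct acks values in use, the grouping yields at most three groups per node, which matches the "up to three separate requests" mentioned earlier in the thread.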