Hi TaiJuWu,

Is it possible to extend this KIP to include topic-level compression for the producer? This is another issue that prevents us from sharing a producer across threads, as it is common to use different compression types for different topics (data).
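
One way to picture such a topic-level setting is a lookup that prefers a per-topic key and falls back to the producer-wide value. The sketch below is only an illustration: the class `TopicOverrides` and the key format `compression.type.<topic>` are assumptions (modeled on the "acks.${topic}" format suggested in the quoted thread), not existing Kafka producer configs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the Kafka client: resolves a per-topic
// override from producer-style properties, falling back to the global value.
class TopicOverrides {
    private final Map<String, String> overrides = new HashMap<>();
    private final String defaultValue;

    // baseKey is e.g. "compression.type"; keys of the form "<baseKey>.<topic>"
    // are treated as topic-level overrides (an assumed format).
    TopicOverrides(Map<String, String> props, String baseKey, String fallback) {
        String prefix = baseKey + ".";
        this.defaultValue = props.getOrDefault(baseKey, fallback);
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                // Everything after the prefix is the topic name, dots included.
                overrides.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
    }

    // Returns the topic's own value if one was configured, else the default.
    String forTopic(String topic) {
        return overrides.getOrDefault(topic, defaultValue);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("compression.type", "lz4");
        props.put("compression.type.sensor.raw", "zstd");
        TopicOverrides compression = new TopicOverrides(props, "compression.type", "none");
        System.out.println(compression.forTopic("sensor.raw"));
        System.out.println(compression.forTopic("logs"));
    }
}
```

Putting the topic name last means a dot inside the topic name cannot be confused with the separator, which is the same reason given in the thread for preferring "acks.${topic}" over "topic.acks".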

Best,
Chia-Ping

On 2024/11/18 08:36:25 TaiJu Wu wrote:
> Hi Chia-Ping,
>
> Thanks for your suggestions and feedback.
>
> Q1: I have updated this according to your suggestions.
> Q2: This change is necessary because *RecordAccumulator* assumes that all
> records have the same acks (e.g. ProducerConfig.acks), so we need a
> method to distinguish which acks belongs to each batch.
>
> Best,
> TaiJuWu
>
> On Mon, Nov 18, 2024 at 2:17 AM Chia-Ping Tsai <chia7...@apache.org> wrote:
>
> > hi TaiJuWu
> >
> > Q0:
> >
> > `Format: topic.acks` the dot is an acceptable character in topic names, so
> > maybe we should reverse the format to "acks.${topic}" to get the acks of
> > a topic easily
> >
> > Q1: `Return Map<Acks, List<ProducerBatch>> when
> > RecordAccumulator#drainBatchesForOneNode is called.`
> >
> > this is weird to me, as all we need to do is pass `Map<String, Acks>` to
> > `Sender` and make sure `Sender#sendProduceRequest` adds the correct acks to
> > ProduceRequest, right?
> >
> > Best,
> > Chia-Ping
> >
> > On 2024/11/15 05:12:33 TaiJu Wu wrote:
> > > Hi all,
> > >
> > > I have updated the contents of this KIP.
> > > Please take a look and let me know what you think.
> > >
> > > Thanks,
> > > TaiJuWu
> > >
> > > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Thanks for your feedback and @Chia-Ping's help.
> > > >
> > > > I also agree that a topic-level acks config is more reasonable and that
> > > > it can simplify the story.
> > > > When I tried implementing record-level acks, I noticed I don't have a good
> > > > idea for avoiding iterating over batches to get partition information
> > > > (needed by *RecordAccumulator#partitionChanged*).
> > > >
> > > > Back to the initial question of how to handle different acks for batches:
> > > > First, we can attach *topic-level acks* to *RecordAccumulator#TopicInfo*.
> > > > Second, we can return *Map<Acks, List<ProducerBatch>>* when
> > > > *RecordAccumulator#drainBatchesForOneNode* is called. In this step, we can
> > > > propagate acks to *sender*.
> > > > Finally, we can get the acks info and group batches with the same acks
> > > > into a *List<ProducerBatch>* for a node in *sender#sendProduceRequests*.
> > > >
> > > > If I missed something or there is any mistake, please let me know.
> > > > I will update this KIP later. Thank you for your feedback.
> > > >
> > > > Best,
> > > > TaiJuWu
> > > >
> > > > On Thu, Nov 14, 2024 at 9:46 AM Chia-Ping Tsai <chia7...@apache.org> wrote:
> > > >
> > > >> hi All
> > > >>
> > > >> This KIP is based on our use case where an edge application with many
> > > >> sensors wants to use a single producer to deliver ‘few but varied’ records
> > > >> with different acks settings. The reason for using a single producer is to
> > > >> minimize resource usage on edge devices with limited hardware capabilities.
> > > >> Currently, we use a producer pool to handle different acks values, which
> > > >> requires 3x producer instances. Additionally, this approach creates many
> > > >> idle producers if a sensor with a specific acks setting has no data for a
> > > >> while.
> > > >>
> > > >> I love David’s suggestion since the acks configuration is closely related
> > > >> to the topic. Maybe we can introduce an optional configuration in the
> > > >> producer to define topic-level acks, with the existing acks being the
> > > >> default for all topics. This approach is not only simple but also easy to
> > > >> understand and implement.
> > > >>
> > > >> Best,
> > > >> Chia-Ping
> > > >>
> > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > > >> > Hi TaiJuWu,
> > > >> > I've been thinking for a while about this KIP before jumping into the
> > > >> > discussion.
> > > >> >
> > > >> > I'm afraid that I don't think the approach in the KIP is the best, given
> > > >> > the design of the Kafka protocol in this area. Essentially, each Produce
> > > >> > request contains the acks value at the top level, and may contain records
> > > >> > for many topics or partitions. My point is that batching occurs at the
> > > >> > level of a Produce request, so changing the acks value between records
> > > >> > will require a new Produce request to be sent. There would likely be an
> > > >> > efficiency penalty if this feature was used heavily with the acks
> > > >> > changing record by record.
> > > >> >
> > > >> > I can see that potentially an application might want different ack levels
> > > >> > for different topics, but I would be surprised if they use different ack
> > > >> > levels within the same topic. Maybe David's suggestion of defining the
> > > >> > acks per topic would be enough. What do you think?
> > > >> >
> > > >> > Thanks,
> > > >> > Andrew
> > > >> > ________________________________________
> > > >> > From: David Jacot <dja...@confluent.io.INVALID>
> > > >> > Sent: 13 November 2024 15:31
> > > >> > To: dev@kafka.apache.org <dev@kafka.apache.org>
> > > >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
> > > >> >
> > > >> > Hi TaiJuWu,
> > > >> >
> > > >> > Thanks for the KIP.
> > > >> >
> > > >> > The motivation is not clear to me. Could you please elaborate a bit more
> > > >> > on it?
> > > >> >
> > > >> > My concern is that it adds a lot of complexity and the added value seems
> > > >> > to be low. Moreover, it will make reasoning about an application from the
> > > >> > server side more difficult because we can no longer assume that it writes
> > > >> > with the ack based on the config.
> > > >> > Another issue is about the batching: how do you plan to handle batches
> > > >> > mixing records with different acks?
> > > >> >
> > > >> > An alternative approach may be to define the ack per topic. We could even
> > > >> > think about defining it on the server side as a topic config. I haven't
> > > >> > really thought about it but it may be something to explore a bit more.
> > > >> >
> > > >> > Best,
> > > >> > David
> > > >> >
> > > >> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
> > > >> > <froul...@confluent.io.invalid> wrote:
> > > >> >
> > > >> > > Hi TaiJuWu,
> > > >> > >
> > > >> > > I find this adds a lot of complexity and I am still not convinced by the
> > > >> > > added value. IMO creating a producer instance per ack level is not
> > > >> > > problematic and the behavior is clear for developers. What would be the
> > > >> > > added value of the proposed change?
> > > >> > >
> > > >> > > Regards,
> > > >> > >
> > > >> > >
> > > >> > > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > > >> > >
> > > >> > > > Hi Fred and Greg,
> > > >> > > >
> > > >> > > > Thanks for your feedback. It is really not straightforward, but it is
> > > >> > > > interesting! Here is the behavior I expect.
> > > >> > > >
> > > >> > > > The current producer uses the *RecordAccumulator* to gather records, and
> > > >> > > > the sender thread sends them in batches. We can track each record’s
> > > >> > > > acknowledgment setting as it is appended to the *RecordAccumulator*,
> > > >> > > > allowing the *sender* to group batches by acknowledgment level and
> > > >> > > > topicPartition when processing.
> > > >> > > >
> > > >> > > > Regarding the statement, "Callbacks for records being sent to the same
> > > >> > > > partition are guaranteed to execute in order," this is ensured when
> > > >> > > > *max.in.flight.requests.per.connection* is set to 1. We can send records
> > > >> > > > with different acknowledgment levels in the order acks=0, acks=1,
> > > >> > > > acks=-1. Since we need to send batches with different acknowledgment
> > > >> > > > levels to the broker in separate requests, the callback will execute
> > > >> > > > after each request is completed.
> > > >> > > >
> > > >> > > > In response to, "If so, are low-acks records subject to head-of-line
> > > >> > > > blocking from high-acks records?," I believe an additional configuration
> > > >> > > > is necessary to control this behavior. We could allow records to be
> > > >> > > > either sync or async, though the callback would still execute after each
> > > >> > > > batch with varying acknowledgment levels completes. To measure behavior
> > > >> > > > across acknowledgment levels, we could also include acks in
> > > >> > > > *ProducerInterceptor*.
> > > >> > > >
> > > >> > > > Furthermore, before this KIP, a producer could only use one acks level,
> > > >> > > > so the sequence was guaranteed. However, with this change, we can *ONLY*
> > > >> > > > guarantee the sequence within records of the same acknowledgment level
> > > >> > > > because we may send up to three separate requests to brokers.
> > > >> > > > Best,
> > > >> > > > TaiJuWu
> > > >> > > >
> > > >> > > >
> > > >> > > > On Wed, Nov 6, 2024 at 10:01 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > > >> > > >
> > > >> > > > > Hi Fred and Greg,
> > > >> > > > >
> > > >> > > > > Apologies for the delayed response.
> > > >> > > > > Yes, you’re correct.
> > > >> > > > > I’ll outline the behavior I expect.
> > > >> > > > >
> > > >> > > > > Thanks for your feedback!
> > > >> > > > >
> > > >> > > > > Best,
> > > >> > > > > TaiJuWu
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Wed, Nov 6, 2024 at 9:48 AM Greg Harris <greg.har...@aiven.io.invalid>
> > > >> > > > > wrote:
> > > >> > > > >
> > > >> > > > >> Hi TaiJuWu,
> > > >> > > > >>
> > > >> > > > >> Thanks for the KIP!
> > > >> > > > >>
> > > >> > > > >> Can you explain in the KIP about the behavior when the number of acks is
> > > >> > > > >> different for individual records? I think the current description using
> > > >> > > > >> the word "straightforward" does little to explain that, and may actually
> > > >> > > > >> be hiding some complexity.
> > > >> > > > >>
> > > >> > > > >> For example, the send() javadoc contains this: "Callbacks for records
> > > >> > > > >> being sent to the same partition are guaranteed to execute in order." Is
> > > >> > > > >> this still true when acks vary for records within the same partition?
> > > >> > > > >> If so, are low-acks records subject to head-of-line-blocking from
> > > >> > > > >> high-acks records? It seems to me that this feature is useful when acks
> > > >> > > > >> is specified per-topic, but introduces a lot of edge cases that are
> > > >> > > > >> underspecified.
> > > >> > > > >>
> > > >> > > > >> Thanks,
> > > >> > > > >> Greg
> > > >> > > > >>
> > > >> > > > >>
> > > >> > > > >> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > > >> > > > >>
> > > >> > > > >> > Hi Chia-Ping,
> > > >> > > > >> >
> > > >> > > > >> > Thanks for your feedback.
> > > >> > > > >> > I have updated the KIP based on your suggestions.
> > > >> > > > >> >
> > > >> > > > >> > Best,
> > > >> > > > >> > Stanley
> > > >> > > > >> >
> > > >> > > > >> > On Tue, Nov 5, 2024 at 4:41 PM Chia-Ping Tsai <chia7...@apache.org> wrote:
> > > >> > > > >> >
> > > >> > > > >> > > hi TaiJuWu,
> > > >> > > > >> > >
> > > >> > > > >> > > Q0: Could you please add a getter (Short acks()) to the "public
> > > >> > > > >> > > interface" section?
> > > >> > > > >> > >
> > > >> > > > >> > > Q1: Could you please add an RPC json reference to prove "been
> > > >> > > > >> > > available at the RPC-level,"
> > > >> > > > >> > >
> > > >> > > > >> > > Q2: Could you please add a link to the producer docs to prove "share
> > > >> > > > >> > > a single producer instance across multiple threads"
> > > >> > > > >> > >
> > > >> > > > >> > > Thanks,
> > > >> > > > >> > > Chia-Ping
> > > >> > > > >> > >
> > > >> > > > >> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
> > > >> > > > >> > > > Hi all,
> > > >> > > > >> > > >
> > > >> > > > >> > > > I have opened KIP-1107: Adding record-level acks for producers
> > > >> > > > >> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers>
> > > >> > > > >> > > > to reduce the limitation associated with reusing KafkaProducer.
> > > >> > > > >> > > >
> > > >> > > > >> > > > Feedback and suggestions are welcome.
> > > >> > > > >> > > >
> > > >> > > > >> > > > Thanks,
> > > >> > > > >> > > > TaiJuWu
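
The grouping step discussed in the quoted thread (look up the acks of each topic, then send batches with the same acks together, since a Produce request carries a single acks value) can be sketched roughly as follows. This is an illustration under stated assumptions, not the KIP's implementation: `AcksGrouping`, `Batch`, and `groupByAcks` are hypothetical names, and `Batch` merely stands in for `ProducerBatch`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these types are Kafka classes.
class AcksGrouping {
    // Minimal stand-in for a drained ProducerBatch.
    static final class Batch {
        final String topic;
        final int partition;
        Batch(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }
    }

    // Groups the batches drained for one node by the acks configured for their
    // topic. Topics without an entry use defaultAcks. Each resulting list would
    // go into its own Produce request.
    static Map<Short, List<Batch>> groupByAcks(List<Batch> drained,
                                               Map<String, Short> topicAcks,
                                               short defaultAcks) {
        Map<Short, List<Batch>> grouped = new TreeMap<>();
        for (Batch b : drained) {
            short acks = topicAcks.getOrDefault(b.topic, defaultAcks);
            grouped.computeIfAbsent(acks, k -> new ArrayList<>()).add(b);
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, Short> topicAcks = new HashMap<>();
        topicAcks.put("telemetry", (short) 0);
        topicAcks.put("alerts", (short) -1);
        List<Batch> drained = new ArrayList<>();
        drained.add(new Batch("telemetry", 0));
        drained.add(new Batch("alerts", 0));
        drained.add(new Batch("telemetry", 1));
        drained.add(new Batch("audit", 0)); // no override, uses the default
        groupByAcks(drained, topicAcks, (short) 1).forEach((acks, batches) ->
            System.out.println("acks=" + acks + " batches=" + batches.size()));
    }
}
```

With at most three distinct acks values (0, 1, -1), one drain for a node yields at most three groups, which matches the "up to three separate requests" mentioned in the thread.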