Hi TaiJu! I will echo the concerns about the likelihood of gotchas arising in an effort to work around the existing API and protocol design.
If the central concern is the performance impact and/or resource overhead of multiple client instances, I'd rather attack that in a more direct manner.

Thanks,
Kirk

On Fri, Jan 3, 2025, at 8:03 AM, Divij Vaidya wrote:
> Hey TaiJu
>
> I read the latest version of the KIP.
>
> I understand the problem you are trying to solve here. But the solution needs more changes than you proposed and hence is not straightforward. As an example, we haven't answered the question about the protocol for ProduceRequest raised above. A `ProduceRequest` defines `acks` at the request level, where the payload consists of records belonging to multiple topics. One way to solve it is to define topic-level `acks` at the server as suggested above in this thread, but wouldn't that require us to remove/deprecate this field?
>
> Alternatively, have you tried to explore the option of decreasing the resource footprint of an idle producer so that it is not expensive to create 3x producers?
> Note that there are disadvantages of "vertically scaling" a producer, i.e. reusing a producer with multiple threads. One of the many disadvantages is that all requests from the producer will be handled by the same network thread on the broker. If that network thread is busy doing IO for some reason (perhaps reading from disk is slow), then it will impact all other requests from that producer. Hence, making producer(s) cheap to create is a goal worth pursuing.
>
> --
> Divij Vaidya
>
> On Fri, Jan 3, 2025 at 4:39 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
>
> > Hello folks,
> >
> > This thread has been pending for a long time; I want to bump it and get more feedback.
> > Any questions are welcome.
> >
> > Best,
> > TaiJuWu
> >
> > TaiJu Wu <tjwu1...@gmail.com> wrote on Sat, Nov 23, 2024 at 9:15 PM:
> >
> > > Hi Chia-Ping,
> > >
> > > Sorry for the late reply, and thanks for your feedback to make this KIP more valuable.
> > > After initial verification, I think this can be done without large changes.
> > >
> > > I have updated the KIP, thanks a lot.
> > >
> > > Best,
> > > TaiJuWu
> > >
> > > Chia-Ping Tsai <chia7...@apache.org> wrote on Wed, Nov 20, 2024 at 5:06 PM:
> > >
> > >> hi TaiJuWu
> > >>
> > >> Is there a possibility to extend this KIP to include topic-level compression for the producer? This is another issue that prevents us from sharing producers across different threads, as it's common to use different compression types for different topics (data).
> > >>
> > >> Best,
> > >> Chia-Ping
> > >>
> > >> On 2024/11/18 08:36:25 TaiJu Wu wrote:
> > >> > Hi Chia-Ping,
> > >> >
> > >> > Thanks for your suggestions and feedback.
> > >> >
> > >> > Q1: I have updated this according to your suggestions.
> > >> > Q2: This is a necessary change, since *RecordAccumulator* assumes that all records have the same acks (e.g. ProducerConfig.acks), so we need a method to distinguish which acks belong to each batch.
> > >> >
> > >> > Best,
> > >> > TaiJuWu
> > >> >
> > >> > Chia-Ping Tsai <chia7...@apache.org> wrote on Mon, Nov 18, 2024 at 2:17 AM:
> > >> >
> > >> > > hi TaiJuWu
> > >> > >
> > >> > > Q0:
> > >> > >
> > >> > > `Format: topic.acks` the dot is an acceptable character in topic naming, so maybe we should reverse the format to "acks.${topic}" to get the acks of a topic easily
> > >> > >
> > >> > > Q1: `Return Map<Acks, List<ProducerBatch>> when RecordAccumulator#drainBatchesForOneNode is called.`
> > >> > >
> > >> > > this is weird to me, as all we need to do is pass `Map<String, Acks>` to `Sender` and make sure `Sender#sendProduceRequest` adds the correct acks to ProduceRequest, right?
> > >> > >
> > >> > > Best,
> > >> > > Chia-Ping
> > >> > >
> > >> > > On 2024/11/15 05:12:33 TaiJu Wu wrote:
> > >> > > > Hi all,
> > >> > > >
> > >> > > > I have updated the contents of this KIP.
> > >> > > > Please take a look and let me know what you think.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > TaiJuWu
> > >> > > >
> > >> > > > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > >> > > >
> > >> > > > > Hi all,
> > >> > > > >
> > >> > > > > Thanks for your feedback and @Chia-Ping's help.
> > >> > > > > I also agree that a topic-level acks config is more reasonable and that it can simplify the story.
> > >> > > > > When I tried implementing record-level acks, I noticed I don't have a good idea for avoiding iterating over batches to get partition information (needed by *RecordAccumulator#partitionChanged*).
> > >> > > > >
> > >> > > > > Back to the initial question of how I can handle different acks for batches:
> > >> > > > > First, we can attach *topic-level acks* to *RecordAccumulator#TopicInfo*.
> > >> > > > > Second, we can return *Map<Acks, List<ProducerBatch>>* when *RecordAccumulator#drainBatchesForOneNode* is called. In this step, we can propagate acks to *sender*.
> > >> > > > > Finally, we can get the acks info and group batches with the same acks into a *List<ProducerBatch>* for a node in *sender#sendProduceRequests*.
> > >> > > > >
> > >> > > > > If I missed something or there is any mistake, please let me know.
> > >> > > > > I will update this KIP later, thanks for your feedback.
> > >> > > > >
> > >> > > > > Best,
> > >> > > > > TaiJuWu
> > >> > > > >
> > >> > > > > Chia-Ping Tsai <chia7...@apache.org> wrote on Thu, Nov 14, 2024 at 9:46 AM:
> > >> > > > >
> > >> > > > >> hi All
> > >> > > > >>
> > >> > > > >> This KIP is based on our use case where an edge application with many sensors wants to use a single producer to deliver ‘few but varied’ records with different acks settings. The reason for using a single producer is to minimize resource usage on edge devices with limited hardware capabilities. Currently, we use a producer pool to handle different acks values, which requires 3x producer instances. Additionally, this approach creates many idle producers if a sensor with a specific acks setting has no data for a while.
> > >> > > > >>
> > >> > > > >> I love David’s suggestion since the acks configuration is closely related to the topic. Maybe we can introduce an optional configuration in the producer to define topic-level acks, with the existing acks being the default for all topics. This approach is not only simple but also easy to understand and implement.
> > >> > > > >>
> > >> > > > >> Best,
> > >> > > > >> Chia-Ping
> > >> > > > >>
> > >> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > >> > > > >> > Hi TaiJuWu,
> > >> > > > >> > I've been thinking for a while about this KIP before jumping into the discussion.
> > >> > > > >> >
> > >> > > > >> > I'm afraid that I don't think the approach in the KIP is the best, given the design of the Kafka protocol in this area.
> > >> > > > >> > Essentially, each Produce request contains the acks value at the top level, and may contain records for many topics or partitions. My point is that batching occurs at the level of a Produce request, so changing the acks value between records will require a new Produce request to be sent. There would likely be an efficiency penalty if this feature was used heavily with the acks changing record by record.
> > >> > > > >> >
> > >> > > > >> > I can see that potentially an application might want different ack levels for different topics, but I would be surprised if they use different ack levels within the same topic. Maybe David's suggestion of defining the acks per topic would be enough. What do you think?
> > >> > > > >> >
> > >> > > > >> > Thanks,
> > >> > > > >> > Andrew
> > >> > > > >> > ________________________________________
> > >> > > > >> > From: David Jacot <dja...@confluent.io.INVALID>
> > >> > > > >> > Sent: 13 November 2024 15:31
> > >> > > > >> > To: dev@kafka.apache.org <dev@kafka.apache.org>
> > >> > > > >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
> > >> > > > >> >
> > >> > > > >> > Hi TaiJuWu,
> > >> > > > >> >
> > >> > > > >> > Thanks for the KIP.
> > >> > > > >> >
> > >> > > > >> > The motivation is not clear to me. Could you please elaborate a bit more on it?
> > >> > > > >> >
> > >> > > > >> > My concern is that it adds a lot of complexity and the added value seems to be low.
> > >> > > > >> > Moreover, it will make reasoning about an application from the server side more difficult because we can no longer assume that it writes with the ack based on the config. Another issue is about the batching: how do you plan to handle batches mixing records with different acks?
> > >> > > > >> >
> > >> > > > >> > An alternative approach may be to define the ack per topic. We could even think about defining it on the server side as a topic config. I haven't really thought about it but it may be something to explore a bit more.
> > >> > > > >> >
> > >> > > > >> > Best,
> > >> > > > >> > David
> > >> > > > >> >
> > >> > > > >> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau <froul...@confluent.io.invalid> wrote:
> > >> > > > >> >
> > >> > > > >> > > Hi TaiJuWu,
> > >> > > > >> > >
> > >> > > > >> > > I find this adds a lot of complexity and I am still not convinced by the added value. IMO creating a producer instance per ack level is not problematic and the behavior is clear for developers. What would be the added value of the proposed change?
> > >> > > > >> > >
> > >> > > > >> > > Regards,
> > >> > > > >> > >
> > >> > > > >> > > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > >> > > > >> > >
> > >> > > > >> > > > Hi Fred and Greg,
> > >> > > > >> > > >
> > >> > > > >> > > > Thanks for your feedback; it is really not straightforward, but interesting!
> > >> > > > >> > > > Here is the behavior I expect.
> > >> > > > >> > > >
> > >> > > > >> > > > The current producer uses the *RecordAccumulator* to gather records, and the sender thread sends them in batches. We can track each record’s acknowledgment setting as it is appended to the *RecordAccumulator*, allowing the *sender* to group batches by acknowledgment level and topicPartition when processing.
> > >> > > > >> > > >
> > >> > > > >> > > > Regarding the statement, "Callbacks for records being sent to the same partition are guaranteed to execute in order," this is ensured when *max.in.flight.requests.per.connection* is set to 1. We can send records with different acknowledgment levels in the order of acks=0, acks=1, acks=-1. Since we need to send batches with different acknowledgment levels to the broker, the callback will execute after each request is completed.
> > >> > > > >> > > >
> > >> > > > >> > > > In response to, "If so, are low-acks records subject to head-of-line blocking from high-acks records?," I believe an additional configuration is necessary to control this behavior. We could allow records to be either sync or async, though the callback would still execute after each batch with varying acknowledgment levels completes.
> > >> > > > >> > > > To measure behavior across acknowledgment levels, we could also include acks in *ProducerInterceptor*.
> > >> > > > >> > > >
> > >> > > > >> > > > Furthermore, before this KIP, a producer could only include one acks level, so record ordering could be assumed. However, with this change, we can *ONLY* guarantee the ordering within records of the same acknowledgment level because we may send up to three separate requests to brokers.
> > >> > > > >> > > > Best,
> > >> > > > >> > > > TaiJuWu
> > >> > > > >> > > >
> > >> > > > >> > > > TaiJu Wu <tjwu1...@gmail.com> wrote on Wed, Nov 6, 2024 at 10:01 AM:
> > >> > > > >> > > >
> > >> > > > >> > > > > Hi Fred and Greg,
> > >> > > > >> > > > >
> > >> > > > >> > > > > Apologies for the delayed response.
> > >> > > > >> > > > > Yes, you’re correct.
> > >> > > > >> > > > > I’ll outline the behavior I expect.
> > >> > > > >> > > > >
> > >> > > > >> > > > > Thanks for your feedback!
> > >> > > > >> > > > >
> > >> > > > >> > > > > Best,
> > >> > > > >> > > > > TaiJuWu
> > >> > > > >> > > > >
> > >> > > > >> > > > > Greg Harris <greg.har...@aiven.io.invalid> wrote on Wed, Nov 6, 2024 at 9:48 AM:
> > >> > > > >> > > > >
> > >> > > > >> > > > >> Hi TaiJuWu,
> > >> > > > >> > > > >>
> > >> > > > >> > > > >> Thanks for the KIP!
> > >> > > > >> > > > >>
> > >> > > > >> > > > >> Can you explain in the KIP about the behavior when the number of acks is different for individual records?
> > >> > > > >> > > > >> I think the current description using the word "straightforward" does little to explain that, and may actually be hiding some complexity.
> > >> > > > >> > > > >>
> > >> > > > >> > > > >> For example, the send() javadoc contains this: "Callbacks for records being sent to the same partition are guaranteed to execute in order." Is this still true when acks vary for records within the same partition?
> > >> > > > >> > > > >> If so, are low-acks records subject to head-of-line-blocking from high-acks records? It seems to me that this feature is useful when acks is specified per-topic, but introduces a lot of edge cases that are underspecified.
> > >> > > > >> > > > >>
> > >> > > > >> > > > >> Thanks,
> > >> > > > >> > > > >> Greg
> > >> > > > >> > > > >>
> > >> > > > >> > > > >> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > >> > > > >> > > > >>
> > >> > > > >> > > > >> > Hi Chia-Ping,
> > >> > > > >> > > > >> >
> > >> > > > >> > > > >> > Thanks for your feedback.
> > >> > > > >> > > > >> > I have updated the KIP based on your suggestions.
> > >> > > > >> > > > >> >
> > >> > > > >> > > > >> > Best,
> > >> > > > >> > > > >> > Stanley
> > >> > > > >> > > > >> >
> > >> > > > >> > > > >> > Chia-Ping Tsai <chia7...@apache.org> wrote on Tue, Nov 5, 2024 at 4:41 PM:
> > >> > > > >> > > > >> >
> > >> > > > >> > > > >> > > hi TaiJuWu,
> > >> > > > >> > > > >> > >
> > >> > > > >> > > > >> > > Q0: Could you please add a getter (Short acks()) to the "public interface" section?
> > >> > > > >> > > > >> > >
> > >> > > > >> > > > >> > > Q1: Could you please add an RPC JSON reference to prove "been available at the RPC-level"?
> > >> > > > >> > > > >> > >
> > >> > > > >> > > > >> > > Q2: Could you please add a link to the producer docs to prove "share a single producer instance across multiple threads"?
> > >> > > > >> > > > >> > >
> > >> > > > >> > > > >> > > Thanks,
> > >> > > > >> > > > >> > > Chia-Ping
> > >> > > > >> > > > >> > >
> > >> > > > >> > > > >> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
> > >> > > > >> > > > >> > > > Hi all,
> > >> > > > >> > > > >> > > >
> > >> > > > >> > > > >> > > > I have opened KIP-1107: Adding record-level acks for producers
> > >> > > > >> > > > >> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers>
> > >> > > > >> > > > >> > > > to reduce the limitation associated with reusing KafkaProducer.
> > >> > > > >> > > > >> > > >
> > >> > > > >> > > > >> > > > Feedback and suggestions are welcome.
> > >> > > > >> > > > >> > > >
> > >> > > > >> > > > >> > > > Thanks,
> > >> > > > >> > > > >> > > > TaiJuWu
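
For readers following the thread, the topic-level lookup discussed above (the "acks.${topic}" key format, with the producer-wide acks as the default, and grouping by acks because a Produce request carries a single acks value) could be sketched roughly as follows. This is only an illustration under stated assumptions: `TopicAcksSketch`, `acksFor` and `groupByAcks` are hypothetical names, the "acks.<topic>" keys are the proposal from this discussion rather than an existing Kafka producer config, and plain strings stand in for the producer's internal batch types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not part of the Kafka client.
public class TopicAcksSketch {
    // Proposed key format "acks.<topic>". Putting the fixed prefix first keeps
    // the lookup unambiguous even though topic names may contain dots.
    static final String PREFIX = "acks.";

    // Returns the topic-level acks if configured, else the producer-wide "acks".
    // Falling back to "all" when neither is set is an assumption of this sketch.
    static String acksFor(Map<String, String> config, String topic) {
        String override = config.get(PREFIX + topic);
        return override != null ? override : config.getOrDefault("acks", "all");
    }

    // Groups topics by their resolved acks. Each group would need its own
    // Produce request, since acks is a request-level field.
    static Map<String, List<String>> groupByAcks(Map<String, String> config, List<String> topics) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String topic : topics) {
            groups.computeIfAbsent(acksFor(config, topic), k -> new ArrayList<>()).add(topic);
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, String> config = Map.of(
                "acks", "1",             // producer-wide default
                "acks.sensor.temp", "0", // topic name containing a dot
                "acks.audit", "all");
        List<String> topics = List.of("sensor.temp", "audit", "metrics");
        // prints {0=[sensor.temp], 1=[metrics], all=[audit]}
        System.out.println(groupByAcks(config, topics));
    }
}
```

With three distinct acks values in play, a drain for one node splits into up to three requests, which is the ordering and efficiency concern raised earlier in the thread.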