Hi Divij and Kirk,

Thanks for your response.
You are right, this change is not straightforward and I apologize for that.

> we haven't answered the question about protocol for ProduceRequest raised
> above.
Sorry, but which question did I miss? This KIP has since been changed from
record-level acks to topic-level acks.

> Note that there are disadvantages of "vertically scaling" a producer i.e.
> reusing a producer with multiple threads.
This change is optional, so users can choose whether to adopt it. If they
don't use it, it has no impact on them.
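
As a rough illustration of the opt-in behavior (a sketch only, not the KIP's
implementation): the "acks.<topic>" key shape follows the "acks.${topic}"
suggestion made earlier in this thread and is not a confirmed config name,
and TopicAcksSketch / resolveAcks are hypothetical helpers, not Kafka APIs.
A topic without an override simply falls back to the existing producer-wide
acks value.

```java
import java.util.Properties;

public class TopicAcksSketch {
    // Hypothetical key prefix; the thread only suggests the "acks.${topic}" shape.
    static final String TOPIC_ACKS_PREFIX = "acks.";

    // Returns the per-topic override when one is configured,
    // otherwise the producer-wide "acks" value (assumed to default to "all").
    static String resolveAcks(Properties config, String topic) {
        String producerWide = config.getProperty("acks", "all");
        return config.getProperty(TOPIC_ACKS_PREFIX + topic, producerWide);
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("acks", "all");                  // existing producer-level setting
        config.setProperty("acks.sensor.temperature", "0"); // hypothetical topic-level override

        // A topic with an override uses it; dots in the topic name are not a problem
        // because the topic comes after the fixed prefix.
        System.out.println(resolveAcks(config, "sensor.temperature"));
        // A topic without an override behaves exactly as it does today.
        System.out.println(resolveAcks(config, "audit-log"));
    }
}
```

Users who never set a topic-level key would always take the fallback path,
which is the sense in which the change has no impact on them.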

> making producer(s) cheap to create is a goal worth pursuing.
> I'd rather attack that in a more direct manner
Thanks for the suggestion. I will investigate this approach in parallel.

Best,
TaiJuWu

On Tue, Jan 7, 2025 at 8:34 AM Kirk True <k...@kirktrue.pro> wrote:

> Hi TaiJu!
>
> I will echo the concerns about the likelihood of gotchas arising in an
> effort to work around the existing API and protocol design.
>
> If the central concern is the performance impact and/or resource overhead
> of multiple client instances, I'd rather attack that in a more direct
> manner.
>
> Thanks,
> Kirk
>
> On Fri, Jan 3, 2025, at 8:03 AM, Divij Vaidya wrote:
> > Hey TaiJu
> >
> > I read the latest version of the KIP.
> >
> > I understand the problem you are trying to solve here. But the solution
> > needs more changes than you proposed and hence, is not straightforward. As
> > an example, we haven't answered the question about protocol for
> > ProduceRequest raised above. A `ProduceRequest` defines `ack` at a request
> > level where the payload consists of records belonging to multiple topics.
> > One way to solve it is to define topic-level `ack` at the server as
> > suggested above in this thread, but wouldn't that require us to
> > remove/deprecate this field?
> >
> > Alternatively, have you tried to explore the option of decreasing the
> > resource footprint of an idle producer so that it is not expensive to
> > create 3x producers?
> > Note that there are disadvantages of "vertically scaling" a producer i.e.
> > reusing a producer with multiple threads. One of the many disadvantages is
> > that all requests from the producer will be handled by the same network
> > thread on the broker. If that network thread is busy doing IO for some
> > reason (perhaps reading from disk is slow), then it will impact all other
> > requests from that producer. Hence, making producer(s) cheap to create is a
> > goal worth pursuing.
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Fri, Jan 3, 2025 at 4:39 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
> >
> > > Hello folks,
> > >
> > > This thread has been pending for a long time, so I want to bump it and get
> > > more feedback.
> > > Any questions are welcome.
> > >
> > > Best,
> > > TaiJuWu
> > >
> > > On Sat, Nov 23, 2024 at 9:15 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > >
> > > > Hi Chia-Ping,
> > > >
> > > > Sorry for the late reply, and thanks for your feedback, which makes this
> > > > KIP more valuable.
> > > > After initial verification, I think this can be done without large changes.
> > > >
> > > > I have updated the KIP. Thanks a lot.
> > > >
> > > > Best,
> > > > TaiJuWu
> > > >
> > > >
> > > > On Wed, Nov 20, 2024 at 5:06 PM Chia-Ping Tsai <chia7...@apache.org> wrote:
> > > >
> > > >> hi TaiJuWu
> > > >>
> > > >> Is there a possibility to extend this KIP to include topic-level
> > > >> compression for the producer? This is another issue that prevents us from
> > > >> sharing producers across different threads, as it's common to use different
> > > >> compression types for different topics (data).
> > > >>
> > > >> Best,
> > > >> Chia-Ping
> > > >>
> > > >> On 2024/11/18 08:36:25 TaiJu Wu wrote:
> > > >> > Hi Chia-Ping,
> > > >> >
> > > >> > Thanks for your suggestions and feedback.
> > > >> >
> > > >> > Q1: I have updated this according to your suggestions.
> > > >> > Q2: This change is necessary because *RecordAccumulator* assumes that
> > > >> > all records have the same acks (e.g. ProducerConfig.acks), so we need a
> > > >> > way to distinguish which acks value belongs to each batch.
> > > >> >
> > > >> > Best,
> > > >> > TaiJuWu
> > > >> >
> > > >> > On Mon, Nov 18, 2024 at 2:17 AM Chia-Ping Tsai <chia7...@apache.org> wrote:
> > > >> >
> > > >> > > hi TaiJuWu
> > > >> > >
> > > >> > > Q0:
> > > >> > >
> > > >> > > `Format: topic.acks`  the dot is an acceptable character in topic naming,
> > > >> > > so maybe we should reverse the format to "acks.${topic}" to get the acks
> > > >> > > of a topic easily
> > > >> > >
> > > >> > > Q1: `Return Map<Acks, List<ProducerBatch>> when
> > > >> > > RecordAccumulator#drainBatchesForOneNode is called.`
> > > >> > >
> > > >> > > this is weird to me, as all we need to do is pass `Map<String, Acks>` to
> > > >> > > `Sender` and make sure `Sender#sendProduceRequest` adds the correct acks
> > > >> > > to ProduceRequest, right?
> > > >> > >
> > > >> > > Best,
> > > >> > > Chia-Ping
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > On 2024/11/15 05:12:33 TaiJu Wu wrote:
> > > >> > > > Hi all,
> > > >> > > >
> > > >> > > > I have updated the contents of this KIP
> > > >> > > > Please take a look and let me know what you think.
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > TaiJuWu
> > > >> > > >
> > > >> > > > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > > >> > > >
> > > >> > > > > Hi all,
> > > >> > > > >
> > > >> > > > > Thanks for your feedback and @Chia-Ping's help.
> > > >> > > > >
> > > >> > > > > I also agree that a topic-level acks config is more reasonable and can
> > > >> > > > > simplify the story.
> > > >> > > > > When I tried implementing record-level acks, I found I had no good way
> > > >> > > > > to avoid iterating over batches to get partition information (needed by
> > > >> > > > > *RecordAccumulator#partitionChanged*).
> > > >> > > > >
> > > >> > > > > Back to the initial question of how to handle different acks for batches:
> > > >> > > > > First, we can attach *topic-level acks* to *RecordAccumulator#TopicInfo*.
> > > >> > > > > Second, we can return *Map<Acks, List<ProducerBatch>>* when
> > > >> > > > > *RecordAccumulator#drainBatchesForOneNode* is called. In this step, we
> > > >> > > > > can propagate acks to the *sender*.
> > > >> > > > > Finally, we can get the acks info and group batches with the same acks
> > > >> > > > > into a *List<ProducerBatch>* for a node in *sender#sendProduceRequests*.
> > > >> > > > >
> > > >> > > > > If I missed something or made any mistake, please let me know.
> > > >> > > > > I will update this KIP later. Thank you for your feedback.
> > > >> > > > >
> > > >> > > > > Best,
> > > >> > > > > TaiJuWu
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Thu, Nov 14, 2024 at 9:46 AM Chia-Ping Tsai <chia7...@apache.org> wrote:
> > > >> > > > >
> > > >> > > > >> hi All
> > > >> > > > >>
> > > >> > > > >> This KIP is based on our use case where an edge application with many
> > > >> > > > >> sensors wants to use a single producer to deliver ‘few but varied’ records
> > > >> > > > >> with different acks settings. The reason for using a single producer is to
> > > >> > > > >> minimize resource usage on edge devices with limited hardware capabilities.
> > > >> > > > >> Currently, we use a producer pool to handle different acks values, which
> > > >> > > > >> requires 3x producer instances. Additionally, this approach creates many
> > > >> > > > >> idle producers if a sensor with a specific acks setting has no data for a
> > > >> > > > >> while.
> > > >> > > > >>
> > > >> > > > >> I love David’s suggestion since the acks configuration is closely related
> > > >> > > > >> to the topic. Maybe we can introduce an optional configuration in the
> > > >> > > > >> producer to define topic-level acks, with the existing acks being the
> > > >> > > > >> default for all topics. This approach is not only simple but also easy to
> > > >> > > > >> understand and implement.
> > > >> > > > >>
> > > >> > > > >> Best,
> > > >> > > > >> Chia-Ping
> > > >> > > > >>
> > > >> > > > >> On 2024/11/13 16:04:24 Andrew Schofield wrote:
> > > >> > > > >> > Hi TaiJuWu,
> > > >> > > > >> > I've been thinking for a while about this KIP before jumping into the
> > > >> > > > >> > discussion.
> > > >> > > > >> >
> > > >> > > > >> > I'm afraid that I don't think the approach in the KIP is the best,
> > > >> > > > >> > given the design of the Kafka protocol in this area. Essentially, each
> > > >> > > > >> > Produce request contains the acks value at the top level, and may contain
> > > >> > > > >> > records for many topics or partitions. My point is that batching occurs
> > > >> > > > >> > at the level of a Produce request, so changing the acks value between
> > > >> > > > >> > records will require a new Produce request to be sent. There would likely
> > > >> > > > >> > be an efficiency penalty if this feature was used heavily with the acks
> > > >> > > > >> > changing record by record.
> > > >> > > > >> >
> > > >> > > > >> > I can see that potentially an application might want different ack levels
> > > >> > > > >> > for different topics, but I would be surprised if they use different ack
> > > >> > > > >> > levels within the same topic. Maybe David's suggestion of defining the
> > > >> > > > >> > acks per topic would be enough. What do you think?
> > > >> > > > >> >
> > > >> > > > >> > Thanks,
> > > >> > > > >> > Andrew
> > > >> > > > >> > ________________________________________
> > > >> > > > >> > From: David Jacot <dja...@confluent.io.INVALID>
> > > >> > > > >> > Sent: 13 November 2024 15:31
> > > >> > > > >> > To: dev@kafka.apache.org <dev@kafka.apache.org>
> > > >> > > > >> > Subject: Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
> > > >> > > > >> >
> > > >> > > > >> > Hi TaiJuWu,
> > > >> > > > >> >
> > > >> > > > >> > Thanks for the KIP.
> > > >> > > > >> >
> > > >> > > > >> > The motivation is not clear to me. Could you please elaborate a bit more
> > > >> > > > >> > on it?
> > > >> > > > >> >
> > > >> > > > >> > My concern is that it adds a lot of complexity and the added value seems
> > > >> > > > >> > to be low. Moreover, it will make reasoning about an application from the
> > > >> > > > >> > server side more difficult because we can no longer assume that it writes
> > > >> > > > >> > with the ack based on the config. Another issue is about the batching, how
> > > >> > > > >> > do you plan to handle batches mixing records with different acks?
> > > >> > > > >> >
> > > >> > > > >> > An alternative approach may be to define the ack per topic. We could even
> > > >> > > > >> > think about defining it on the server side as a topic config. I haven't
> > > >> > > > >> > really thought about it but it may be something to explore a bit more.
> > > >> > > > >> >
> > > >> > > > >> > Best,
> > > >> > > > >> > David
> > > >> > > > >> >
> > > >> > > > >> > On Wed, Nov 13, 2024 at 3:56 PM Frédérik Rouleau
> > > >> > > > >> > <froul...@confluent.io.invalid> wrote:
> > > >> > > > >> >
> > > >> > > > >> > > Hi TaiJuWu,
> > > >> > > > >> > >
> > > >> > > > >> > > I find this adds lots of complexity and I am still not convinced by the
> > > >> > > > >> > > added value. IMO creating a producer instance per ack level is not
> > > >> > > > >> > > problematic and the behavior is clear for developers. What would be the
> > > >> > > > >> > > added value of the proposed change?
> > > >> > > > >> > >
> > > >> > > > >> > > Regards,
> > > >> > > > >> > >
> > > >> > > > >> > >
> > > >> > > > >> > > On Wed, Nov 6, 2024 at 7:50 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > > >> > > > >> > >
> > > >> > > > >> > > > Hi Fred and Greg,
> > > >> > > > >> > > >
> > > >> > > > >> > > > Thanks for your feedback. It is really not straightforward, but
> > > >> > > > >> > > > interesting!
> > > >> > > > >> > > > Here is the behavior I expect.
> > > >> > > > >> > > >
> > > >> > > > >> > > > The current producer uses the *RecordAccumulator* to gather records, and
> > > >> > > > >> > > > the sender thread sends them in batches. We can track each record’s
> > > >> > > > >> > > > acknowledgment setting as it is appended to the *RecordAccumulator*,
> > > >> > > > >> > > > allowing the *sender* to group batches by acknowledgment level and
> > > >> > > > >> > > > topicPartition when processing.
> > > >> > > > >> > > >
> > > >> > > > >> > > > Regarding the statement, "Callbacks for records being sent to the same
> > > >> > > > >> > > > partition are guaranteed to execute in order," this is ensured when
> > > >> > > > >> > > > *max.in.flight.requests.per.connection* is set to 1. We can send records
> > > >> > > > >> > > > with different acknowledgment levels in the order of acks=0, acks=1,
> > > >> > > > >> > > > acks=-1. Since we need to send batches with different acknowledgment
> > > >> > > > >> > > > levels to the broker, the callback will execute after each request is
> > > >> > > > >> > > > completed.
> > > >> > > > >> > > >
> > > >> > > > >> > > > In response to, "If so, are low-acks records subject to head-of-line
> > > >> > > > >> > > > blocking from high-acks records?," I believe an additional configuration
> > > >> > > > >> > > > is necessary to control this behavior. We could allow records to be either
> > > >> > > > >> > > > sync or async, though the callback would still execute after each batch
> > > >> > > > >> > > > with varying acknowledgment levels completes. To measure behavior across
> > > >> > > > >> > > > acknowledgment levels, we could also include acks in
> > > >> > > > >> > > > *ProducerInterceptor*.
> > > >> > > > >> > > >
> > > >> > > > >> > > > Furthermore, before this KIP, a producer could only include one acks
> > > >> > > > >> > > > level, so the sequence was guaranteed. However, with this change, we can
> > > >> > > > >> > > > *ONLY* guarantee the sequence among records of the same acknowledgment
> > > >> > > > >> > > > level because we may send up to three separate requests to brokers.
> > > >> > > > >> > > > Best,
> > > >> > > > >> > > > TaiJuWu
> > > >> > > > >> > > >
> > > >> > > > >> > > >
> > > >> > > > >> > > > On Wed, Nov 6, 2024 at 10:01 AM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > > >> > > > >> > > >
> > > >> > > > >> > > > > Hi  Fred and Greg,
> > > >> > > > >> > > > >
> > > >> > > > >> > > > > Apologies for the delayed response.
> > > >> > > > >> > > > > Yes, you’re correct.
> > > >> > > > >> > > > > I’ll outline the behavior I expect.
> > > >> > > > >> > > > >
> > > >> > > > >> > > > > Thanks for your feedback!
> > > >> > > > >> > > > >
> > > >> > > > >> > > > > Best,
> > > >> > > > >> > > > > TaiJuWu
> > > >> > > > >> > > > >
> > > >> > > > >> > > > >
> > > >> > > > >> > > > > On Wed, Nov 6, 2024 at 9:48 AM Greg Harris <greg.har...@aiven.io.invalid> wrote:
> > > >> > > > >> > > > >
> > > >> > > > >> > > > >> Hi TaiJuWu,
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >> Thanks for the KIP!
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >> Can you explain in the KIP about the behavior when the number of acks is
> > > >> > > > >> > > > >> different for individual records? I think the current description using
> > > >> > > > >> > > > >> the word "straightforward" does little to explain that, and may actually
> > > >> > > > >> > > > >> be hiding some complexity.
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >> For example, the send() javadoc contains this: "Callbacks for records
> > > >> > > > >> > > > >> being sent to the same partition are guaranteed to execute in order." Is
> > > >> > > > >> > > > >> this still true when acks vary for records within the same partition?
> > > >> > > > >> > > > >> If so, are low-acks records subject to head-of-line-blocking from
> > > >> > > > >> > > > >> high-acks records? It seems to me that this feature is useful when acks
> > > >> > > > >> > > > >> is specified per-topic, but introduces a lot of edge cases that are
> > > >> > > > >> > > > >> underspecified.
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >> Thanks,
> > > >> > > > >> > > > >> Greg
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >> On Tue, Nov 5, 2024 at 4:52 PM TaiJu Wu <tjwu1...@gmail.com> wrote:
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >> > Hi Chia-Ping,
> > > >> > > > >> > > > >> >
> > > >> > > > >> > > > >> > Thanks for your feedback.
> > > >> > > > >> > > > >> > I have updated the KIP based on your suggestions.
> > > >> > > > >> > > > >> >
> > > >> > > > >> > > > >> > Best,
> > > >> > > > >> > > > >> > Stanley
> > > >> > > > >> > > > >> >
> > > >> > > > >> > > > >> > On Tue, Nov 5, 2024 at 4:41 PM Chia-Ping Tsai <chia7...@apache.org> wrote:
> > > >> > > > >> > > > >> >
> > > >> > > > >> > > > >> > > hi TaiJuWu,
> > > >> > > > >> > > > >> > >
> > > >> > > > >> > > > >> > > Q0: Could you please add getter (Short acks()) to "public interface"
> > > >> > > > >> > > > >> > > section?
> > > >> > > > >> > > > >> > >
> > > >> > > > >> > > > >> > > Q1: Could you please add RPC json reference to prove "been available at
> > > >> > > > >> > > > >> > > the RPC-level,"
> > > >> > > > >> > > > >> > >
> > > >> > > > >> > > > >> > > Q2: Could you please add link to producer docs to prove "share a single
> > > >> > > > >> > > > >> > > producer instance across multiple threads"
> > > >> > > > >> > > > >> > >
> > > >> > > > >> > > > >> > > Thanks,
> > > >> > > > >> > > > >> > > Chia-Ping
> > > >> > > > >> > > > >> > >
> > > >> > > > >> > > > >> > > On 2024/11/05 01:28:36 吳岱儒 wrote:
> > > >> > > > >> > > > >> > > > Hi all,
> > > >> > > > >> > > > >> > > >
> > > >> > > > >> > > > >> > > > I opened KIP-1107: Adding record-level acks for producers
> > > >> > > > >> > > > >> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1107%3A++Adding+record-level+acks+for+producers>
> > > >> > > > >> > > > >> > > > to reduce the limitations associated with reusing KafkaProducer.
> > > >> > > > >> > > > >> > > >
> > > >> > > > >> > > > >> > > > Feedback and suggestions are welcome.
> > > >> > > > >> > > > >> > > >
> > > >> > > > >> > > > >> > > > Thanks,
> > > >> > > > >> > > > >> > > > TaiJuWu
> > > >> > > > >> > > > >> > > >
> > > >> > > > >> > > > >> > >
> > > >> > > > >> > > > >> >
> > > >> > > > >> > > > >>
> > > >> > > > >> > > > >
> > > >> > > > >> > > >
> > > >> > > > >> > >
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>
