Bumping this thread to see if there are any comments or thoughts.
Thanks.

Luke

On Mon, Sep 26, 2022 at 11:06 AM Luke Chen <show...@gmail.com> wrote:

> Hi devs,
>
> As stated in the motivation section in KIP-854
> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-854+Separate+configuration+for+producer+ID+expiry>:
>
> With idempotent producers becoming the default in Kafka, this means that
> unless otherwise specified, all new producers will be given producer IDs.
> Some (inefficient) applications may now create many non-transactional
> idempotent producers. Each of these producers will be assigned a producer
> ID and these IDs and their metadata are stored in the broker memory, which
> might cause brokers to run out of memory.
>
> Justine (in cc), some other team members, and I are working on solutions
> for this issue, but none of them solves it completely without side
> effects. The trade-off we can't settle is whether to sacrifice
> availability or idempotency guarantees. Some of these solutions sacrifice
> produce availability (1, 2, 5) and another sacrifices idempotency
> guarantees (3). It would be useful to know whether people generally prefer
> one or the other, or whether there are better solutions.
>
> Here are the proposals we came up with:
>
> 1. Limit the total number of active producer IDs allocated
> -> This is the simplest solution. But since the OOM issue is usually
> caused by a rogue or misconfigured client, this solution might "punish"
> well-behaved clients by preventing them from sending messages.
>
> 2. Throttle the producer ID allocation rate
> -> Same concern as solution #1.
>
> 3. Limit the number of active producer IDs (sort of like an LRU cache)
> -> The idea here is that if we hit a misconfigured client, we will expire
> the older entries. The concern is that we risk losing idempotency
> guarantees, and currently we have no way to notify clients that they have
> lost them. Besides, the least recently used entries that get removed do
> not always belong to the "bad" clients.
>
> 4. Allow clients to "close" the producer ID usage
> -> We can provide a way for a producer to "close" its producer ID.
> Currently, we only have the INIT_PRODUCER_ID request to allocate one.
> After that, we keep the producer ID metadata in the broker even if the
> producer is "closed". With a close API (e.g., END_PRODUCER_ID), we can
> remove the entry on the broker side. On the client side, we can send it
> when the producer closes. The concern is that with old clients (including
> non-Java clients), the OOM issue will remain.
>
> 5. Limit/throttle producer IDs based on the principal
> -> Although this idea limits the impact to a certain principal, the same
> concern as in solutions #1 and #2 still exists.
>
> Any thoughts/feedback are welcome.
>
> Thank you.
> Luke
>
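
To make the concern with proposal 3 concrete, here is a rough sketch of an LRU-style bound on producer ID entries. It is not Kafka broker code: the class name ProducerIdLruCache and the maxActiveProducerIds limit are made up for illustration, and the String value stands in for whatever per-producer metadata the broker would keep.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch only, not Kafka broker code: an access-ordered map
 * of producer ID -> metadata that evicts the least recently used entry
 * once a fixed limit is exceeded.
 */
class ProducerIdLruCache<V> extends LinkedHashMap<Long, V> {
    // Assumed limit for illustration; not an existing broker config.
    private final int maxActiveProducerIds;

    ProducerIdLruCache(int maxActiveProducerIds) {
        // accessOrder = true: get() and put() move an entry to the "newest" end.
        super(16, 0.75f, true);
        this.maxActiveProducerIds = maxActiveProducerIds;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, V> eldest) {
        // Called after each insertion; returning true drops the LRU entry.
        return size() > maxActiveProducerIds;
    }

    public static void main(String[] args) {
        ProducerIdLruCache<String> cache = new ProducerIdLruCache<>(3);
        cache.put(1L, "well-behaved producer");   // idle after its first write
        for (long id = 100; id < 103; id++) {
            cache.put(id, "rogue client");        // rogue client keeps allocating IDs
        }
        // The idle but well-behaved producer was the one that got evicted.
        System.out.println("producer 1 still tracked: " + cache.containsKey(1L));
    }
}
```

In this run the entry for producer 1 is evicted while all three rogue entries survive, which is the "not always from the bad clients" problem: eviction order follows recency, not client behaviour.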

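Similarly, proposal 5 combined with the "close" idea from proposal 4 might look roughly like the sketch below. Everything here is an assumption for illustration: PerPrincipalProducerIdLimiter, tryRegister, release, and the per-principal cap are hypothetical names, and principals are plain strings rather than real Kafka principal objects.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch only, not Kafka broker code: caps the number of
 * active producer IDs per principal and lets an ID be released again.
 */
class PerPrincipalProducerIdLimiter {
    // Assumed cap for illustration; not an existing broker config.
    private final int maxIdsPerPrincipal;
    private final Map<String, Set<Long>> activeIds = new HashMap<>();

    PerPrincipalProducerIdLimiter(int maxIdsPerPrincipal) {
        this.maxIdsPerPrincipal = maxIdsPerPrincipal;
    }

    /** Returns false when the principal already holds its maximum number of IDs. */
    synchronized boolean tryRegister(String principal, long producerId) {
        Set<Long> ids = activeIds.computeIfAbsent(principal, p -> new HashSet<>());
        if (ids.contains(producerId)) {
            return true; // already tracked, nothing new to allocate
        }
        if (ids.size() >= maxIdsPerPrincipal) {
            return false; // this is where produce availability is sacrificed
        }
        ids.add(producerId);
        return true;
    }

    /** Stands in for an END_PRODUCER_ID-style request freeing the entry. */
    synchronized void release(String principal, long producerId) {
        Set<Long> ids = activeIds.get(principal);
        if (ids != null) {
            ids.remove(producerId);
            if (ids.isEmpty()) {
                activeIds.remove(principal);
            }
        }
    }

    public static void main(String[] args) {
        PerPrincipalProducerIdLimiter limiter = new PerPrincipalProducerIdLimiter(2);
        System.out.println(limiter.tryRegister("app-a", 1L)); // true
        System.out.println(limiter.tryRegister("app-a", 2L)); // true
        System.out.println(limiter.tryRegister("app-a", 3L)); // false: app-a is at its cap
        System.out.println(limiter.tryRegister("app-b", 4L)); // true: other principals unaffected
        limiter.release("app-a", 1L);
        System.out.println(limiter.tryRegister("app-a", 3L)); // true after the release
    }
}
```

The impact stays within one principal, but a well-behaved client sharing "app-a" with a rogue one is still rejected, so the concern from solutions #1 and #2 is only narrowed, not removed.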