Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

Chris Egerton Wed, 17 Apr 2024 10:19:11 -0700

Hi Alieh,

Thanks for the KIP! The issue with writing to non-existent topics is
particularly frustrating for users of Kafka Connect and has been the source
of a handful of Jira tickets over the years. My thoughts:

1. An additional detail we can add to the motivation (or possibly rejected
alternatives) section is that this kind of custom retry logic can't be
implemented by hand by, e.g., setting retries to 0 in the producer config
and handling exceptions at the application level. Or rather, it can, but 1)
it's a bit painful to have to reimplement at every call-site for
Producer::send (and any code that awaits the returned Future) and 2) it's
impossible to do this without losing idempotency on retries.

2. That said, I wonder if a pluggable interface is really the right call
here. Documenting the interactions of a producer with
a ClientExceptionHandler instance will be tricky, and implementing them
will also be a fair amount of work. I believe that there needs to be some
more granularity for how writes to non-existent topics (or really,
UNKNOWN_TOPIC_OR_PARTITION and related errors from the broker) are handled,
but I'm torn between keeping it simple with maybe one or two new producer
config properties, or a full-blown pluggable interface. If there are more
cases that would benefit from a pluggable interface, it would be nice to
identify these and add them to the KIP to strengthen the motivation. Right
now, I'm not sure the two that are mentioned in the motivation are
sufficient.

3. Alternatively, a possible compromise is for this KIP to introduce new
properties that dictate how to handle unknown-topic-partition and
record-too-large errors, with the thinking that if we introduce a pluggable
interface later on, these properties will be recognized by the default
implementation of that interface but could be completely ignored or
replaced by alternative implementations.

4. (Nit) You can remove the "This page is meant as a template for writing a
KIP..." part from the KIP. It's not a template anymore :)

5. If we do go the pluggable interface route, wouldn't we want to add the
possibility for retry logic? The simplest version of this could be to add a
RETRY value to the ClientExceptionHandlerResponse enum.

6. I think "SKIP" or "DROP" might be clearer instead of "CONTINUE" for
the ClientExceptionHandlerResponse enum, since they cause records to be
dropped.

Cheers,

Chris

On Wed, Apr 17, 2024 at 12:25 PM Justine Olshan
<jols...@confluent.io.invalid> wrote:

> Hey Alieh,
>
> I echo what Omnia says, I'm not sure I understand the implications of the
> change and I think more detail is needed.
>
> This comment also confused me a bit:
> * {@code ClientExceptionHandler} that continues the transaction even if a
> record is too large.
> * Otherwise, it makes the transaction to fail.
>
> Relatedly, I've been working with some folks on a KIP for transactions
> errors and how they are handled. Specifically for the
> RecordTooLargeException (and a few other errors), we want to give a new
> error category for this error that allows the application to choose how it
> is handled. Maybe this KIP is something that you are looking for? Stay
> tuned :)
>
> Justine
>
>
>
>
>
> On Wed, Apr 17, 2024 at 8:03 AM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> wrote:
>
> > Hi Alieh,
> > Thanks for the KIP! I have couple of comments
> > - You mentioned in the KIP motivation,
> > > Another example for which a production exception handler could be
> useful
> > is if a user tries to write into a non-existing topic, which returns a
> > retryable error code; with infinite retries, the producer would hang
> > retrying forever. A handler could help to break the infinite retry loop.
> >
> > How the handler can differentiate between something that is temporary and
> > it should keep retrying and something permanent like forgot to create the
> > topic? temporary here could be
> >  the producer get deployed before the topic creation finish (specially if
> > the topic creation is handled via IaC)
> >  temporary offline partitions
> >  leadership changing
> >         Isn’t this putting the producer at risk of dropping records
> > unintentionally?
> > - Can you elaborate more on what is written in the compatibility /
> > migration plan section please by explaining in bit more details what is
> the
> > changing behaviour and how this will impact client who are upgrading?
> > - In the proposal changes can you elaborate in the KIP where in the
> > producer lifecycle will ClientExceptionHandler and
> > TransactionExceptionHandler get triggered, and how will the producer
> > configure them to point to costumed implementation.
> >
> > Thanks
> > Omnia
> >
> > > On 17 Apr 2024, at 13:13, Alieh Saeedi <asae...@confluent.io.INVALID>
> > wrote:
> > >
> > > Hi all,
> > > Here is the KIP-1038: Add Custom Error Handler to Producer.
> > > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1038%3A+Add+Custom+Error+Handler+to+Producer
> > >
> > > I look forward to your feedback!
> > >
> > > Cheers,
> > > Alieh
> >
> >
>

Re: [DISCUSS] KIP-1038: Add Custom Error Handler to Producer

Reply via email to