Thanks Boyang, the newly added example looks good to me.

On Thu, Apr 9, 2020 at 2:47 PM Boyang Chen <reluctanthero...@gmail.com>
wrote:

> Hey Guozhang,
>
> I have added an example of the producer API usage under new improvements.
> Let me know if this looks good to you.
>
> Boyang
>
> On Wed, Apr 8, 2020 at 1:38 PM Boyang Chen <reluctanthero...@gmail.com>
> wrote:
>
> > That's a good suggestion Jason. Adding a dedicated PRODUCER_FENCED error
> > should help distinguish exceptions and could safely mark
> > INVALID_PRODUCER_EPOCH exception as non-fatal in the new code. Updated
> the
> > KIP.
> >
> > Boyang
> >
> > On Wed, Apr 8, 2020 at 12:18 PM Jason Gustafson <ja...@confluent.io>
> > wrote:
> >
> >> Hey Boyang,
> >>
> >> Thanks for the KIP. I think the main problem we've identified here is
> that
> >> the current errors conflate transaction timeouts with producer fencing.
> >> The
> >> first of these ought to be recoverable, but we cannot distinguish it.
> The
> >> suggestion to add a new error code makes sense to me, but it leaves this
> >> bit of awkwardness:
> >>
> >> > One extra issue that needs to be addressed is how to handle
> >> `ProducerFenced` from Produce requests.
> >>
> >> In fact, the underlying error code here is INVALID_PRODUCER_EPOCH. It's
> >> just that the code treats this as equivalent to `ProducerFenced`. One
> >> thought I had is maybe PRODUCER_FENCED needs to be a separate error code
> >> as
> >> well. After all, only the transaction coordinator knows whether a
> producer
> >> has been fenced or not. So maybe the handling could be something like
> the
> >> following:
> >>
> >> 1. Produce requests may return INVALID_PRODUCER_EPOCH. The producer
> >> recovers by following KIP-360 logic to see whether the epoch can be
> >> bumped.
> >> If it cannot because the broker version is too old, we fail.
> >> 2. Transactional APIs may return either TRANSACTION_TIMEOUT or
> >> PRODUCER_FENCED. In the first case, we do the same as above. We try to
> >> recover by bumping the epoch. If the error is PRODUCER_FENCED, it is
> >> fatal.
> >> 3. Older brokers may return INVALID_PRODUCER_EPOCH as well from
> >> transactional APIs. We treat this the same as 1.
> >>
> >> What do you think?
> >>
> >> Thanks,
> >> Jason
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Mon, Apr 6, 2020 at 3:41 PM Boyang Chen <reluctanthero...@gmail.com>
> >> wrote:
> >>
> >> > Yep, updated the KIP, thanks!
> >> >
> >> > On Mon, Apr 6, 2020 at 3:11 PM Guozhang Wang <wangg...@gmail.com>
> >> wrote:
> >> >
> >> > > Regarding 2), sounds good, I saw UNKNOWN_PRODUCER_ID is properly
> >> handled
> >> > > today in produce / add-partitions-to-txn / add-offsets-to-txn /
> >> end-txn
> >> > > responses, so that should be well covered.
> >> > >
> >> > > Could you reflect this in the wiki page that the broker should be
> >> > > responsible for using different error codes given client request
> >> versions
> >> > > as well?
> >> > >
> >> > >
> >> > >
> >> > > Guozhang
> >> > >
> >> > > On Mon, Apr 6, 2020 at 9:20 AM Boyang Chen <
> >> reluctanthero...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Thanks Guozhang for the review!
> >> > > >
> >> > > > On Sun, Apr 5, 2020 at 5:47 PM Guozhang Wang <wangg...@gmail.com>
> >> > wrote:
> >> > > >
> >> > > > > Hello Boyang,
> >> > > > >
> >> > > > > Thank you for the proposed KIP. Just some minor comments below:
> >> > > > >
> >> > > > > 1. Could you also describe which producer APIs could potentially
> >> > throw
> >> > > > the
> >> > > > > new TransactionTimedOutException, and also how should callers
> >> handle
> >> > > them
> >> > > > > differently (i.e. just to make your description more concrete as
> >> > > > javadocs).
> >> > > > >
> >> > > > > Good point, I will add example java doc changes.
> >> > > >
> >> > > >
> >> > > > > 2. It's straight-forward if client is on newer version while
> >> broker's
> >> > > on
> >> > > > > older version; however If the client is on older version while
> >> > broker's
> >> > > > on
> >> > > > > newer version, today would the internal module of producers
> treat
> >> it
> >> > > as a
> >> > > > > general fatal error or not? If not, should the broker set a
> >> different
> >> > > > error
> >> > > > > code upon detecting older request versions?
> >> > > > >
> >> > > > > That's a good suggestion, my understanding is that the
> >> prerequisite
> >> > of
> >> > > > this change is the new KIP-360 API which is going out with 2.5,
> >> > > > so we could just return UNKNOWN_PRODUCER_ID instead of
> >> PRODUCER_FENCED
> >> > as
> >> > > > it could be interpreted as abortable error
> >> > > > in 2.5 producer and retry. Producers older than 2.5 will not be
> >> > covered.
> >> > > > WDYT?
> >> > > >
> >> > > > >
> >> > > > > Guozhang
> >> > > > >
> >> > > > > On Thu, Apr 2, 2020 at 1:40 PM Boyang Chen <
> >> > reluctanthero...@gmail.com
> >> > > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hey there,
> >> > > > > >
> >> > > > > > I would like to start discussion for KIP-588:
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-588%3A+Allow+producers+to+recover+gracefully+from+transaction+timeouts
> >> > > > > >
> >> > > > > > which aims to improve Producer resilience to transaction
> timeout
> >> > due
> >> > > to
> >> > > > > > transient system gaps.
> >> > > > > >
> >> > > > > > Best,
> >> > > > > > Boyang
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > -- Guozhang
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > -- Guozhang
> >> > >
> >> >
> >>
> >
>


-- 
-- Guozhang

Reply via email to