Boyang,

Thanks for the update. This change makes sense to me.

Guozhang

On Tue, Feb 11, 2020 at 11:37 AM Boyang Chen <reluctanthero...@gmail.com>
wrote:

> Hey there,
>
> we are adding a small change to the KIP-447 public API. The default value
> of `transaction.abort.timed.out.transaction.cleanup.interval.ms` shall be
> changed from 1 minute to 10 seconds. The goal here is to trigger the
> expired transaction more frequently in order to reduce the consumer pending
> offset fetch wait time.
>
> Let me know if you have further questions, thanks!
>
>
> On Wed, Jan 8, 2020 at 3:44 PM Boyang Chen <reluctanthero...@gmail.com>
> wrote:
>
> > Thanks Guozhang for another review! I have addressed all the javadoc
> > changes for PendingTransactionException in the KIP. For
> FENCED_INSTANCE_ID
> > the only thrown place would be on the new send offsets API, which is also
> > addressed.
> >
> > Thanks Matthias for the vote! As we have 3 binding votes (Guozhang,
> Jason,
> > and Matthias), the KIP is officially accepted and prepared to ship in
> 2.5.
> >
> > Still feel free to put more thoughts on either discussion or voting
> thread
> > to refine the KIP!
> >
> >
> > On Wed, Jan 8, 2020 at 3:15 PM Matthias J. Sax <matth...@confluent.io>
> > wrote:
> >
> >> I just re-read the KIP. Overall I am +1 as well.
> >>
> >
> >> Some minor comments (also apply to the Google design doc):
> >>
> >> 1) As 2.4 was release, references should be updated to 2.5.
> >>
> >>  Addressed
> >
> >>
> >>
> >>>  2) About the upgrade path, the KIP says:
> >>
> >> 2a)
> >>
> >> > Broker must be upgraded to 2.4 first. This means the
> >> `inter.broker.protocol.version` (IBP) has to be set to the latest. Any
> >> produce request with higher version will automatically get fenced
> because
> >> of no support.
> >>
> >> From my understanding, this is not correct? After a broker is updated to
> >> the new binaries, it should accept new requests, even if IBP was not
> >> bumped yet?
> >>
> >> Your understanding was correct, after some offline discussion we should
> > not worry about IBP in this case.
> >
> >> 2b)
> >>
> >> About the two rolling bounces for KS apps and the statement
> >>
> >> > one should never allow task producer and thread producer under the
> same
> >> application group
> >>
> >> In the second rolling bounce, we might actually mix both (ie, per-task
> >> and per-thread producers) but this is fine as explained in the KIP. The
> >> only case we cannot allow is, old per-task producers (without consumer
> >> generation fencing) to be mixed with per-thread producers (that rely
> >> solely on consumer generation fencing).
> >>
> >> Does this sound correct?
> >>
> >> Correct, that's the purpose of doing 2 rolling bounce, where the first
> > one is to guarantee everyone's opt-in for generation fencing.
> >
> >>
> >> 3) We should also document how users can use KS 2.5 applications against
> >> older brokers -- for this case, we need to stay on per-task producers
> >> and cannot use the new fencing mechanism. Currently, the KIP only
> >> describe a single way how users could make this work: by setting (and
> >> keeping) UPGRADE_FROM config to 2.4 (what might not be an ideal solution
> >> and might also not be clear by itself that people would need to do
> this)?
> >>
> >>
> >> Yes exactly, at the moment we are actively working on a plan to acquire
> > broker's IBP during stream start-up and initialize based off that
> > information,
> > so that user doesn't need to keep UPGRADE_FROM simply for working with
> old
> > brokers.
> >
> >>
> >> -Matthias
> >>
> >>
> >>
> >> On 9/18/19 4:41 PM, Boyang Chen wrote:
> >> > Bump this thread to see if someone could also review!
> >> >
> >> > On Mon, Sep 9, 2019 at 5:00 PM Boyang Chen <
> reluctanthero...@gmail.com>
> >> > wrote:
> >> >
> >> >> Thank you Jason! Addressed the comments.
> >> >>
> >> >> Thank you Guozhang for explaining. I will document the timeout
> setting
> >> >> reasoning in the design doc.
> >> >>
> >> >>
> >> >> On Mon, Sep 9, 2019 at 1:49 PM Guozhang Wang <wangg...@gmail.com>
> >> wrote:
> >> >>
> >> >>> On Fri, Sep 6, 2019 at 6:33 PM Boyang Chen <
> >> reluctanthero...@gmail.com>
> >> >>> wrote:
> >> >>>
> >> >>>> Thanks Guozhang, I have polished the design doc to make it sync
> with
> >> >>>> current KIP. As for overriding default timeout values, I guess it's
> >> >>> already
> >> >>>> stated in the KIP to set txn timeout to 10s, are you suggesting we
> >> >>> should
> >> >>>> also put down this recommendation on the KIP for non-stream EOS
> >> users?
> >> >>>>
> >> >>>> My comment is not for changing any produce / consumer default
> config
> >> >>> values, but for the Streams configs, to make sure that our
> >> >>> overridden config values respect the above rules. That is, we check
> >> the
> >> >>> actual value used in the config if they are ever overridden by
> users,
> >> and
> >> >>> if the above were not true we can log a warning that it may be risky
> >> to
> >> >>> encounter some unnecessary rebalances.
> >> >>>
> >> >>> Again, this is not something we need to include in the KIP since it
> >> is not
> >> >>> part of public APIs, just to emphasize that the internal
> >> implementation
> >> >>> can
> >> >>> have some safety guard like this.
> >> >>>
> >> >>> Guozhang
> >> >>>
> >> >>>
> >> >>>
> >> >>>> Boyang
> >> >>>>
> >> >>>> On Thu, Sep 5, 2019 at 8:43 PM Guozhang Wang <wangg...@gmail.com>
> >> >>> wrote:
> >> >>>>
> >> >>>>> Hello Boyang,
> >> >>>>>
> >> >>>>> Just realized one thing about timeout configurations that we
> should
> >> >>>>> consider including in this KIP as well:
> >> >>>>>
> >> >>>>> 1) In Producer we have: max.block.ms (default value 60sec),
> >> >>>>> request.timeout
> >> >>>>> (30sec), delivery.timeout.ms (120sec), transaction.timeout
> (60sec)
> >> >>>>> 2) In Consumer we have: session.timeout (10sec), request.timeout
> >> >>> (30sec),
> >> >>>>> default.api.timeout.ms (60sec).
> >> >>>>>
> >> >>>>> Within a transaction (i.e. after we've beginTxn), we could
> >> potentially
> >> >>>> call
> >> >>>>> consumer blocking APIs that depend on default.api.timeout.ms, and
> >> >>> call
> >> >>>>> producer blocking APIs that depend on max.block.ms. Also, if the
> >> >>> user is
> >> >>>>> following a consumer->producer pattern, then it could be kicked
> and
> >> >>>> fenced
> >> >>>>> either by txn or by consumer group session.
> >> >>>>>
> >> >>>>> So we want to make sure that in the caller, e.g. Kafka Streams:
> >> >>>>>
> >> >>>>> 1) transaction.timeout < max.block.ms
> >> >>>>> 2) transaction.timeout < delivery.timeout.ms
> >> >>>>> 3) transaction.timeout < default.api.timeout.ms
> >> >>>>> 4) transaction.timeout ~= default.api.timeout.ms (I think this is
> >> >>>> already
> >> >>>>> mentioned in the KIP, just wanted to bring this up again)
> >> >>>>>
> >> >>>>> We do not need to override the default since not every users are
> >> >>>> following
> >> >>>>> the consumer -> producer pattern, but in cases like Streams where
> it
> >> >>> is
> >> >>>>> indeed the case, we should override the default values to obey the
> >> >>> above
> >> >>>>> rules.
> >> >>>>>
> >> >>>>> Guozhang
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On Thu, Sep 5, 2019 at 5:47 PM Guozhang Wang <wangg...@gmail.com>
> >> >>> wrote:
> >> >>>>>
> >> >>>>>> Thanks Boyang, I'm +1 on the KIP.
> >> >>>>>>
> >> >>>>>> Could you also update the detailed design doc
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >>
> https://docs.google.com/document/d/1LhzHGeX7_Lay4xvrEXxfciuDWATjpUXQhrEIkph9qRE/edit
> >> >>>>> which
> >> >>>>>> seems a bit out-dated with the latest proposal?
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Guozhang
> >> >>>>>>
> >> >>>>>> On Wed, Sep 4, 2019 at 10:45 AM Boyang Chen <
> >> >>>> reluctanthero...@gmail.com>
> >> >>>>>> wrote:
> >> >>>>>>
> >> >>>>>>> Hey all,
> >> >>>>>>>
> >> >>>>>>> I would like to start the vote for KIP-447
> >> >>>>>>> <
> >> >>>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-447%3A+Producer+scalability+for+exactly+once+semantics
> >> >>>>>>>> .
> >> >>>>>>> This is a very important step to improve Kafka Streams
> scalability
> >> >>> in
> >> >>>>>>> exactly-once semantics, by avoiding linearly increasing number
> of
> >> >>>>>>> producers
> >> >>>>>>> with topic partition increases.
> >> >>>>>>>
> >> >>>>>>> Thanks,
> >> >>>>>>> Boyang
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> --
> >> >>>>>> -- Guozhang
> >> >>>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> --
> >> >>>>> -- Guozhang
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> -- Guozhang
> >> >>>
> >> >>
> >> >
> >>
> >>
>


-- 
-- Guozhang

Reply via email to