Thanks Guozhang for the summary. If everyone agrees to the strategy here, which include:
1. No affection to 2.5 release, maintaining the same crashing logic for new Producer API talking to old clients 2. Add an internal flag to silently downgrade protocol for Kafka Streams case only I will go ahead to make the corresponding changes to unblock. Boyang On Mon, Mar 30, 2020 at 11:22 AM Guozhang Wang <wangg...@gmail.com> wrote: > Hello Colin, > > Thanks for the context you provided with CreateAclsRequest to make the > point for compatibility policy, and I think they make sense -- given > KIP-447 is primarily motivated for Streams clients, we should still > consider it's potential compatibility pitfall for non-Streams clients. > > About the resolution here, personally I do not think that punt this KIP for > the next release is a good idea; instead I think we can make the change for > Streams only while keeping non-Streams to still throw --- Boyang have > proposed with the internal flag do to so, and I'd want to add that we > should emphasize on docs when UnsupportedVersion is thrown on producer it > might already be too late (as shown in the example in KIP wiki, in such a > consumer -> process -> producer pattern when producer throws some > processing may already been done with irreversible side-effects) so that > users are aware. > > > Guozhang > > > On Sun, Mar 29, 2020 at 9:07 PM Colin McCabe <cmcc...@apache.org> wrote: > > > Thanks for the explanations, all. It seems to me that the proposed > silent > > downgrade change leaves non-streams applications with no way to safely > use > > KIP-447. If they use it, it will sometimes be safe, but sometimes not, > > depending on what they are doing. > > > > This doesn't seem acceptable. People should be able to rely on EOS when > > it exists. > > > > I agree that it is frustrating that the UnsupportedVersionException > > happens when the applications have already set up many things, rather > than > > being accessible when the application first starts up. But this does not > > fundamentally change the compatibility policy. It is difficult > (actually, > > it's impossible) to work around the lack of CreateAclsRequest or > > DeleteAclsRequest. But this doesn't motivate us to return fake results > for > > the corresponding AdminClient calls when the RPCs are not present. Some > > things can't be faked. > > > > It seems like the right thing to do here would be to expose the KIP-447 > > functionality as a feature flag. Then, applications can test if that > > feature flag is present before they perform a long setup process. They > can > > also decide if they want to provide a fallback path or not. > > > > Of course, feature flags won't make 2.5, so we need some more short-term > > solution here. Either we punt this KIP to the next release, or we find > > some way to make it safe for everyone, not just Streams (possibly by only > > enabling it for streams?) > > > > best, > > Colin > > > > > > On Sat, Mar 28, 2020, at 01:31, Matthias J. Sax wrote: > > > I agree with Guozhang that the issue only arises for the > > > `read-process-write` pattern and thus we should guard at the read level > > > already and thus auto-downgrading the write path commitTx request seems > > > fine. > > > > > > Note that for producer-only and consumer-only apps the auto-downgrade > is > > > totally fine (and actually desired, especially for the consumer to > > > guarantee backward compatibility as there is not public API change) for > > > the corresponding request. For `read-process-write` with no > exactly-once > > > or the existing exactly-once pattern auto-downgrading is also fine for > > > both clients. > > > > > > The only issue I see atm might be, that the consumer side guard depends > > > on an internal config and there is not public API to enable this guard. > > > Hence, for non-Streams users there might be a gap: If they don't > upgrade > > > their brokers first but switch to the new exactly-once pattern they > > > loose exactly-once protection without any warning... > > > > > > I am frankly not sure how many users would actually build exactly-once > > > apps without using Streams and might be affected? > > > > > > To address the issue I consider the following two options: > > > > > > - educate users about this internal consumer config > > > - make the consumer config public > > > > > > > > > Thoughts? > > > > > > -Matthias > > > > > > > > > > > > > > > On 3/27/20 10:12 PM, Guozhang Wang wrote: > > > > Hello Colin, Ismael, > > > > > > > > Thanks for your feedbacks, they are quite helpful. Just to provide > some > > > > context here about OffsetFetch: > > > > > > > > 1) When building the offset fetch request, we used to auto > "downgrade" > > the > > > > request by falling back the requireStable flag when broker supporter > > > > version is < 7. > > > > > > > > > > > https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/OffsetFetchRequest.java#L82 > > > > > > > > 2) The newly introduced config value "eos-beta" in KIP-447 requires > > brokers > > > > to be on version 2.5 or later, and if broker is not on newer version, > > we > > > > need to let the client to fail fast; in order to do so an internal > > config > > > > from consumer is exposed to Streams, which allows the build function > to > > > > throw the UnsupportedVersionException instead of auto downgrading > when > > > > building the offset-fetch that requireStable flag. This is only > turned > > on > > > > by Streams so that if broker is not on newer version when "eos-beta" > is > > > > enabled on the client, we will fail fast . > > > > > > > > In that sense, a new client would not use old RPC talking to the > > brokers -- > > > > the key point is that, upon starting a task the offset fetch is > firstly > > > > sent before we start processing records, so that's where we should > > stop the > > > > client right away instead of silently reading unstable offsets which > > would > > > > already break the fencing guarantees if producers were not fenced on > > txn > > > > coordinator. In other words, we think that letting the overloaded > > > > producer#sendOffsetsToTxn to throw UnsupportedVersion exception is > > already > > > > too late, and hence we should not relying on that API to let clients > > detect > > > > if brokers are on older versions. > > > > > > > > For KIP-98, however, letting producer#initTxn to throw > > UnsupportedVersion > > > > makes total sense since that's the first API triggered before > > producers try > > > > to send any records. > > > > > > > > 3) With that in mind, I think letting the overloaded > > > > producer#sendOffsetsToTxn to not throw UnsupportedVersion is > > appropriate, > > > > since KIP-447 only applies to a consume -> process -> produce > scenario > > and > > > > does not apply to a producer-only scenario for transactional > messaging. > > > > Instead, the consumer API calls that trigger earlier would detect > older > > > > brokers. A side benefit of this is that, on the caller side (a.k.a > > Streams) > > > > we do not need to first detect the broker version and then decide > which > > > > overloaded `producer#sendOffsetsToTxn` to call based on that, which > > also > > > > means that we may eventually deprecate the old overloaded function > one > > day. > > > > > > > > > > > > Please let me know if you have any questions regarding this > rationale, > > and > > > > we can continue the discussion from there. > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > > On Fri, Mar 27, 2020 at 9:21 PM Ismael Juma <ism...@juma.me.uk> > wrote: > > > > > > > >> I'm a bit puzzled. We added this feature because we thought it was > > useful. > > > >> And now we are saying that you don't know if you can rely on it > since > > the > > > >> downgrade happens silently. Can you provide more context on the > > OffsetFetch > > > >> downgrade? What is the implication of that? > > > >> > > > >> Ismael > > > >> > > > >> On Fri, Mar 27, 2020 at 7:30 PM Boyang Chen < > > reluctanthero...@gmail.com> > > > >> wrote: > > > >> > > > >>> Thanks Colin, I think the point of this change is to make the new > > client > > > >>> experience better while working with old brokers, when the upgrade > is > > > >>> happening on client side only. For Streams, it is very common to > have > > > >> more > > > >>> advanced client version. On code level, the path to call > transaction > > > >> commit > > > >>> is better to be unified instead of diverged with try-catch. As a > > matter > > > >> of > > > >>> fact, we have already implemented the downgrade of the OffsetFetch > > > >> protocol > > > >>> to adapt this broker version incompatibility. New transaction > commit > > > >>> protocol thus behaves inconsistent with what we did with other > > protocols. > > > >>> > > > >>> In terms of the impact for this last minute change, silent > downgrade > > has > > > >>> one downside which is the customized EOS applications could not > > alarm the > > > >>> user of semantic violation anymore by simply crashing. To be > honest, > > I'm > > > >>> not sure how severe this would affect customized community usages, > > as we > > > >>> don't have user stats for that yet :) > > > >>> > > > >>> Boyang > > > >>> > > > >>> On Fri, Mar 27, 2020 at 6:32 PM Colin McCabe <cmcc...@apache.org> > > wrote: > > > >>> > > > >>>> On Fri, Mar 27, 2020, at 18:29, Colin McCabe wrote: > > > >>>>> On Fri, Mar 27, 2020, at 16:06, Boyang Chen wrote: > > > >>>>>> Hey there, > > > >>>>>> > > > >>>>>> we would like to address an improvement on the > > > >>>>>> *Producer#sendOffsetsToTransaction(offsets, > > > >>>>>> groupMetadata) *API. Previously we assume the calling of this > > > >>> function > > > >>>>>> would crash the app if broker is not on version 2.5.0 or higher. > > As > > > >>>> Streams > > > >>>>>> side change was rolling out, the disadvantage of this behavior > > > >>> becomes > > > >>>>>> obvious, since on the application level it is hard to detect > > broker > > > >>>> version > > > >>>>>> and do separate call paths for old and new transaction commit > > > >>>> protocols. > > > >>>>> > > > >>>>> Hi Boyang, > > > >>>>> > > > >>>>> Applications are never supposed to detect broker versions, but > they > > > >> may > > > >>>>> be required to detect the presence or absence of a feature. That > > > >> would > > > >>>>> happen if a new feature was introduced that couldn't be emulated > > with > > > >>>>> some combination of the features that previously existed. > > > >>>>> > > > >>>>> That was the case for the original EOS work. For example, > > > >>>>> initTransactions will throw UnsupportedVersionException if the > > > >> producer > > > >>>>> tries to use it, but the broker doesn't support it. > > > >>>>> > > > >>>>>> > > > >>>>>> Thus, we are planning to change the behavior from crashing to > > > >>> silently > > > >>>>>> downgrade to the older transactional offset commit protocol by > > > >>>> resetting > > > >>>>>> the consumer group metadata fields. Details in this PR > > > >>>>>> <https://github.com/apache/kafka/pull/8375>. > > > >>>>>> > > > >>>>> > > > >>>>> The big question is will using the old client code with the old > RPC > > > >> be > > > >>>>> less safe than using the old client code with the old RPC? If it > > > >> will, > > > >>>>> there's a strong case to be made that streams should catch the > > > >>>>> UnsupportedVersionException and fall back to its old behavior. > > > >>>> > > > >>>> Sorry, that should read "new client code with old RPC". > Basically, > > are > > > >>> we > > > >>>> making things worse for users with older brokers and newer > clients? > > > >>>> > > > >>>> Colin > > > >>>> > > > >>>>> > > > >>>>> best, > > > >>>>> Colin > > > >>>>> > > > >>>>>> > > > >>>>>> Let me know if you have any questions, thanks! > > > >>>>>> > > > >>>>>> On Thu, Mar 26, 2020 at 12:45 PM Matthias J. Sax < > > mj...@apache.org > > > >>> > > > >>>> wrote: > > > >>>>>> > > > >>>>>>> One more change for KIP-447. > > > >>>>>>> > > > >>>>>>> Currently, Kafka Streams collects task-level metrics called > > > >>>>>>> "commit-latency-[max|avg]". However, with KIP-447 tasks don't > get > > > >>>>>>> committed individually any longer, and thus this metrics do not > > > >>> make > > > >>>>>>> sense going forward. > > > >>>>>>> > > > >>>>>>> Therefore, we propose to remove those metrics in 2.6 release. > > > >>>>>>> Deprecation does not make much sense, as there is just nothing > to > > > >>> be > > > >>>>>>> recorded in a useful way any longer. > > > >>>>>>> > > > >>>>>>> I also updated the upgrade path section, as the upgrade path is > > > >>>> actually > > > >>>>>>> simpler as described originally. > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> If there are any concerns, please let us know! > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> -Matthias > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> On 3/5/20 12:56 PM, Matthias J. Sax wrote: > > > >>>>>>>> There is one more change to this KIP for the upgrade path of > > > >>> Kafka > > > >>>>>>>> Streams applications: > > > >>>>>>>> > > > >>>>>>>> We cannot detect broker versions reliable, and thus, we need > > > >>> users > > > >>>> to > > > >>>>>>>> manually opt-in to the feature. Thus, we need to add a third > > > >>> value > > > >>>> for > > > >>>>>>>> configuration parameter `processing.guarantee` that we call > > > >>>>>>>> `exactly_once_beta` -- specifying this config will enable the > > > >>>> producer > > > >>>>>>>> per thread design. > > > >>>>>>>> > > > >>>>>>>> I updated the KIP accordingly. Please let us know if there are > > > >>> any > > > >>>>>>>> concerns. > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>>> -Matthias > > > >>>>>>>> > > > >>>>>>>> On 2/11/20 4:44 PM, Guozhang Wang wrote: > > > >>>>>>>>> Boyang, > > > >>>>>>>> > > > >>>>>>>>> Thanks for the update. This change makes sense to me. > > > >>>>>>>> > > > >>>>>>>>> Guozhang > > > >>>>>>>> > > > >>>>>>>>> On Tue, Feb 11, 2020 at 11:37 AM Boyang Chen > > > >>>>>>>>> <reluctanthero...@gmail.com> wrote: > > > >>>>>>>> > > > >>>>>>>>>> Hey there, > > > >>>>>>>>>> > > > >>>>>>>>>> we are adding a small change to the KIP-447 public API. The > > > >>>>>>>>>> default value of > > > >>>>>>>>>> ` > transaction.abort.timed.out.transaction.cleanup.interval.ms > > > >> ` > > > >>>>>>>>>> shall be changed from 1 minute to 10 seconds. The goal here > > > >> is > > > >>> to > > > >>>>>>>>>> trigger the expired transaction more frequently in order to > > > >>>>>>>>>> reduce the consumer pending offset fetch wait time. > > > >>>>>>>>>> > > > >>>>>>>>>> Let me know if you have further questions, thanks! > > > >>>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>>>>> On Wed, Jan 8, 2020 at 3:44 PM Boyang Chen > > > >>>>>>>>>> <reluctanthero...@gmail.com> wrote: > > > >>>>>>>>>> > > > >>>>>>>>>>> Thanks Guozhang for another review! I have addressed all > the > > > >>>>>>>>>>> javadoc changes for PendingTransactionException in the KIP. > > > >>>>>>>>>>> For > > > >>>>>>>>>> FENCED_INSTANCE_ID > > > >>>>>>>>>>> the only thrown place would be on the new send offsets API, > > > >>>>>>>>>>> which is also addressed. > > > >>>>>>>>>>> > > > >>>>>>>>>>> Thanks Matthias for the vote! As we have 3 binding votes > > > >>>>>>>>>>> (Guozhang, > > > >>>>>>>>>> Jason, > > > >>>>>>>>>>> and Matthias), the KIP is officially accepted and prepared > > > >> to > > > >>>>>>>>>>> ship in > > > >>>>>>>>>> 2.5. > > > >>>>>>>>>>> > > > >>>>>>>>>>> Still feel free to put more thoughts on either discussion > or > > > >>>>>>>>>>> voting > > > >>>>>>>>>> thread > > > >>>>>>>>>>> to refine the KIP! > > > >>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>>>>> On Wed, Jan 8, 2020 at 3:15 PM Matthias J. Sax > > > >>>>>>>>>>> <matth...@confluent.io> wrote: > > > >>>>>>>>>>> > > > >>>>>>>>>>>> I just re-read the KIP. Overall I am +1 as well. > > > >>>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>>>>>> Some minor comments (also apply to the Google design doc): > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> 1) As 2.4 was release, references should be updated to > 2.5. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Addressed > > > >>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>>> 2) About the upgrade path, the KIP says: > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> 2a) > > > >>>>>>>>>>>> > > > >>>>>>>>>>>>> Broker must be upgraded to 2.4 first. This means the > > > >>>>>>>>>>>> `inter.broker.protocol.version` (IBP) has to be set to the > > > >>>>>>>>>>>> latest. Any produce request with higher version will > > > >>>>>>>>>>>> automatically get fenced > > > >>>>>>>>>> because > > > >>>>>>>>>>>> of no support. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> From my understanding, this is not correct? After a broker > > > >> is > > > >>>>>>>>>>>> updated to the new binaries, it should accept new > requests, > > > >>>>>>>>>>>> even if IBP was not bumped yet? > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Your understanding was correct, after some offline > > > >> discussion > > > >>>>>>>>>>>> we should > > > >>>>>>>>>>> not worry about IBP in this case. > > > >>>>>>>>>>> > > > >>>>>>>>>>>> 2b) > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> About the two rolling bounces for KS apps and the > statement > > > >>>>>>>>>>>> > > > >>>>>>>>>>>>> one should never allow task producer and thread producer > > > >>>>>>>>>>>>> under the > > > >>>>>>>>>> same > > > >>>>>>>>>>>> application group > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> In the second rolling bounce, we might actually mix both > > > >> (ie, > > > >>>>>>>>>>>> per-task and per-thread producers) but this is fine as > > > >>>>>>>>>>>> explained in the KIP. The only case we cannot allow is, > old > > > >>>>>>>>>>>> per-task producers (without consumer generation fencing) > to > > > >>>>>>>>>>>> be mixed with per-thread producers (that rely solely on > > > >>>>>>>>>>>> consumer generation fencing). > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Does this sound correct? > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Correct, that's the purpose of doing 2 rolling bounce, > > > >> where > > > >>>>>>>>>>>> the first > > > >>>>>>>>>>> one is to guarantee everyone's opt-in for generation > > > >> fencing. > > > >>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> 3) We should also document how users can use KS 2.5 > > > >>>>>>>>>>>> applications against older brokers -- for this case, we > > > >> need > > > >>>>>>>>>>>> to stay on per-task producers and cannot use the new > > > >> fencing > > > >>>>>>>>>>>> mechanism. Currently, the KIP only describe a single way > > > >> how > > > >>>>>>>>>>>> users could make this work: by setting (and keeping) > > > >>>>>>>>>>>> UPGRADE_FROM config to 2.4 (what might not be an ideal > > > >>>>>>>>>>>> solution and might also not be clear by itself that people > > > >>>>>>>>>>>> would need to do > > > >>>>>>>>>> this)? > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Yes exactly, at the moment we are actively working on a > > > >> plan > > > >>>>>>>>>>>> to acquire > > > >>>>>>>>>>> broker's IBP during stream start-up and initialize based > off > > > >>>>>>>>>>> that information, so that user doesn't need to keep > > > >>>>>>>>>>> UPGRADE_FROM simply for working with > > > >>>>>>>>>> old > > > >>>>>>>>>>> brokers. > > > >>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> -Matthias > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> On 9/18/19 4:41 PM, Boyang Chen wrote: > > > >>>>>>>>>>>>> Bump this thread to see if someone could also review! > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> On Mon, Sep 9, 2019 at 5:00 PM Boyang Chen < > > > >>>>>>>>>> reluctanthero...@gmail.com> > > > >>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Thank you Jason! Addressed the comments. > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Thank you Guozhang for explaining. I will document the > > > >>>>>>>>>>>>>> timeout > > > >>>>>>>>>> setting > > > >>>>>>>>>>>>>> reasoning in the design doc. > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> On Mon, Sep 9, 2019 at 1:49 PM Guozhang Wang > > > >>>>>>>>>>>>>> <wangg...@gmail.com> > > > >>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> On Fri, Sep 6, 2019 at 6:33 PM Boyang Chen < > > > >>>>>>>>>>>> reluctanthero...@gmail.com> > > > >>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> Thanks Guozhang, I have polished the design doc to > > > >>>>>>>>>>>>>>>> make it sync > > > >>>>>>>>>> with > > > >>>>>>>>>>>>>>>> current KIP. As for overriding default timeout > > > >>>>>>>>>>>>>>>> values, I guess it's > > > >>>>>>>>>>>>>>> already > > > >>>>>>>>>>>>>>>> stated in the KIP to set txn timeout to 10s, are you > > > >>>>>>>>>>>>>>>> suggesting we > > > >>>>>>>>>>>>>>> should > > > >>>>>>>>>>>>>>>> also put down this recommendation on the KIP for > > > >>>>>>>>>>>>>>>> non-stream EOS > > > >>>>>>>>>>>> users? > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> My comment is not for changing any produce / consumer > > > >>>>>>>>>>>>>>>> default > > > >>>>>>>>>> config > > > >>>>>>>>>>>>>>> values, but for the Streams configs, to make sure that > > > >>>>>>>>>>>>>>> our overridden config values respect the above rules. > > > >>>>>>>>>>>>>>> That is, we check > > > >>>>>>>>>>>> the > > > >>>>>>>>>>>>>>> actual value used in the config if they are ever > > > >>>>>>>>>>>>>>> overridden by > > > >>>>>>>>>> users, > > > >>>>>>>>>>>> and > > > >>>>>>>>>>>>>>> if the above were not true we can log a warning that it > > > >>>>>>>>>>>>>>> may be risky > > > >>>>>>>>>>>> to > > > >>>>>>>>>>>>>>> encounter some unnecessary rebalances. > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> Again, this is not something we need to include in the > > > >>>>>>>>>>>>>>> KIP since it > > > >>>>>>>>>>>> is not > > > >>>>>>>>>>>>>>> part of public APIs, just to emphasize that the > > > >>>>>>>>>>>>>>> internal > > > >>>>>>>>>>>> implementation > > > >>>>>>>>>>>>>>> can have some safety guard like this. > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> Guozhang > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> Boyang > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> On Thu, Sep 5, 2019 at 8:43 PM Guozhang Wang > > > >>>>>>>>>>>>>>>> <wangg...@gmail.com> > > > >>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Hello Boyang, > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Just realized one thing about timeout > > > >>>>>>>>>>>>>>>>> configurations that we > > > >>>>>>>>>> should > > > >>>>>>>>>>>>>>>>> consider including in this KIP as well: > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> 1) In Producer we have: max.block.ms (default value > > > >>>>>>>>>>>>>>>>> 60sec), request.timeout (30sec), > > > >>>>>>>>>>>>>>>>> delivery.timeout.ms (120sec), transaction.timeout > > > >>>>>>>>>> (60sec) > > > >>>>>>>>>>>>>>>>> 2) In Consumer we have: session.timeout (10sec), > > > >>>>>>>>>>>>>>>>> request.timeout > > > >>>>>>>>>>>>>>> (30sec), > > > >>>>>>>>>>>>>>>>> default.api.timeout.ms (60sec). > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Within a transaction (i.e. after we've beginTxn), > > > >>>>>>>>>>>>>>>>> we could > > > >>>>>>>>>>>> potentially > > > >>>>>>>>>>>>>>>> call > > > >>>>>>>>>>>>>>>>> consumer blocking APIs that depend on > > > >>>>>>>>>>>>>>>>> default.api.timeout.ms, and > > > >>>>>>>>>>>>>>> call > > > >>>>>>>>>>>>>>>>> producer blocking APIs that depend on max.block.ms. > > > >>>>>>>>>>>>>>>>> Also, if the > > > >>>>>>>>>>>>>>> user is > > > >>>>>>>>>>>>>>>>> following a consumer->producer pattern, then it > > > >>>>>>>>>>>>>>>>> could be kicked > > > >>>>>>>>>> and > > > >>>>>>>>>>>>>>>> fenced > > > >>>>>>>>>>>>>>>>> either by txn or by consumer group session. > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> So we want to make sure that in the caller, e.g. > > > >>>>>>>>>>>>>>>>> Kafka Streams: > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> 1) transaction.timeout < max.block.ms 2) > > > >>>>>>>>>>>>>>>>> transaction.timeout < delivery.timeout.ms 3) > > > >>>>>>>>>>>>>>>>> transaction.timeout < default.api.timeout.ms 4) > > > >>>>>>>>>>>>>>>>> transaction.timeout ~= default.api.timeout.ms (I > > > >>>>>>>>>>>>>>>>> think this is > > > >>>>>>>>>>>>>>>> already > > > >>>>>>>>>>>>>>>>> mentioned in the KIP, just wanted to bring this up > > > >>>>>>>>>>>>>>>>> again) > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> We do not need to override the default since not > > > >>>>>>>>>>>>>>>>> every users are > > > >>>>>>>>>>>>>>>> following > > > >>>>>>>>>>>>>>>>> the consumer -> producer pattern, but in cases like > > > >>>>>>>>>>>>>>>>> Streams where > > > >>>>>>>>>> it > > > >>>>>>>>>>>>>>> is > > > >>>>>>>>>>>>>>>>> indeed the case, we should override the default > > > >>>>>>>>>>>>>>>>> values to obey the > > > >>>>>>>>>>>>>>> above > > > >>>>>>>>>>>>>>>>> rules. > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Guozhang > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> On Thu, Sep 5, 2019 at 5:47 PM Guozhang Wang > > > >>>>>>>>>>>>>>>>> <wangg...@gmail.com> > > > >>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> Thanks Boyang, I'm +1 on the KIP. > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> Could you also update the detailed design doc > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>> > > > >>>> > > https://docs.google.com/document/d/1LhzHGeX7_Lay4xvrEXxfciuDWATjpUXQh > > > >>>>>>>> rEIkph9qRE/edit > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>>> which > > > >>>>>>>>>>>>>>>>>> seems a bit out-dated with the latest proposal? > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> Guozhang > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> On Wed, Sep 4, 2019 at 10:45 AM Boyang Chen < > > > >>>>>>>>>>>>>>>> reluctanthero...@gmail.com> > > > >>>>>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Hey all, > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> I would like to start the vote for KIP-447 < > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>> > > > >>>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-447%3A+Producer > > > >>>>>>>> +scalability+for+exactly+once+semantics > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>>> . > > > >>>>>>>>>>>>>>>>>>> This is a very important step to improve Kafka > > > >>>>>>>>>>>>>>>>>>> Streams > > > >>>>>>>>>> scalability > > > >>>>>>>>>>>>>>> in > > > >>>>>>>>>>>>>>>>>>> exactly-once semantics, by avoiding linearly > > > >>>>>>>>>>>>>>>>>>> increasing number > > > >>>>>>>>>> of > > > >>>>>>>>>>>>>>>>>>> producers with topic partition increases. > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Thanks, Boyang > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> -- -- Guozhang > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> -- -- Guozhang > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> -- -- Guozhang > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>> > > > >>>>>>> > > > >>>>>> > > > >>>> > > > >>> > > > >> > > > > > > > > > > > > > > > > > Attachments: > > > * signature.asc > > > > > -- > -- Guozhang >