2. Sounds good. Thanks! Guozhang
On Mon, Jul 29, 2019 at 11:43 AM Jason Gustafson <ja...@confluent.io> wrote: > @Colin > > Yeah, that's a good question. If the version is higher, I think it's not > safe for the leader to use the new version until it has received the full > LeaderAndIsr state. For example, there may be a reassignment in progress > which could alter the leader's expected state change. At least knowing > about the higher version saves the leader unnecessary retries and will be > useful for debugging missed LeaderAndIsr updates. I considered letting the > AlterIsr response include the full leader and ISR state so that the leader > could continue without relying on the update. However, I think this would > just tend to mask bugs in the controller's propagation logic. It seems > simpler to have one mechanism for propagation of leader and ISR state > rather than two. > > @Guozhang > > 1. Yes, it is the same version field used in LeaderAndIsr requests. I will > make this explicit. > 2. I think both options can work. The main thing I'm trying to avoid is > having the leader blocking on an ISR update. At some point, if the leader > doesn't receive an expected LeaderAndIsr update, then it must retry the > AlterIsr request. I thought it would be simpler if retries always reflected > the latest expectation on the ISR state rather than letting the leader be > stuck retrying a state change which may no longer be relevant. This is the > approach that I modeled. That being said, if it's ok with you, I'd prefer > to leave this decision to the implementation. I think the main point is > that once the leader receives the latest update, it can discard any pending > state. > > > Thanks, > Jason > > > > On Mon, Jul 29, 2019 at 9:22 AM Guozhang Wang <wangg...@gmail.com> wrote: > > > Hi Jason, > > > > Thanks for the KIP. It looks good to me overall. > > > > 1. Just clarifying the "CurrentVersion" field in the newly proposed > > protocol. Does it contains the same value as zkVersion that controller > get > > from ZK? > > > > 2. As for the comment in the KIP: "We can either await the update or we > can > > send a new update with the current version. If we did the latter, then we > > would have multiple pending inflights using the same version." My > > understanding is that it is the controller who acts as the source of > truth > > for "currentVersion", in which case I think there's little latency > benefits > > to send multiple pending requests with the same version, since which-ever > > arrives controller first would cause the zkVersion to be bumped and > > therefore the rest of the requests would be rejected with > > "INVALID_ISR_VERSION". So I'd favor just wait the update response from > the > > current inflight request before sending out the next request -- > admittedly > > this requires a bit more complicated implementation on the brokers, but > > maybe we can generalize the request queue module on controller for this > > purpose? > > > > > > Guozhang > > > > > > On Sun, Jul 28, 2019 at 10:32 AM Colin McCabe <cmcc...@apache.org> > wrote: > > > > > Hi Jason, > > > > > > This looks good. > > > > > > If the AlterIsr request returns a higher ZK version than the one the > > > broker currently has, will the broker use that as its new ZK version? > I > > > suppose this could happen if some of the updates the controller pushed > > out > > > were not received or not received yet by the broker in question. > > > > > > best, > > > Colin > > > > > > > > > On Fri, Jul 26, 2019, at 09:43, Jason Gustafson wrote: > > > > Hi All, > > > > > > > > I have written a proposal to change the way leaders make ISR > > > > modifications: > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-497%3A+Add+inter-broker+API+to+alter+ISR > > > . > > > > Have a look and let me know what you think. > > > > > > > > Thanks, > > > > Jason > > > > > > > > > > > > > -- > > -- Guozhang > > > -- -- Guozhang