Hi, Mayank, Thanks for the KIP. I agree with others that it would be useful to see the performance results. Otherwise, just a minor comment. If we don't plan to bump up the FetchResponse version, we could just remove the reference to version 16.
Jun On Wed, Jul 19, 2023 at 2:31 PM Mayank Shekhar Narula < mayanks.nar...@gmail.com> wrote: > Luke > > Thanks for the interest in the KIP. > > But what if the consumer was fetching from the follower? > > We already include `PreferredReadReplica` in the fetch response. > > Should we put the node info of PreferredReadReplica under this case, > > instead of the leader's info? > > > > PreferredReadReplica is the decided on the leader. Looking at the Java > client code, AbstractFetch::selectReadReplica, first fetch request goes to > Leader of the partition -> Sends back PreferredReadReplica -> Next fetch > uses PreferredReadReplica. So as long as leader is available, > PreferredReadReplica would be found in subsequent fetches. > > Also, under this case, should we include the leader's info in the response? > > > In this case, I think the follower would fail the fetch if it knows a > different leader. If the follower knows a newer leader, it would return new > leader information in the response, for the client to act on. > > > Will we include the leader/node info in the response when having > > `UNKNOWN_LEADER_EPOCH` error? > > > My understanding is UNKNOWN_LEADER_EPOCH when a request from a client has a > newer epoch than the broker. So the client is already up to date on new > leader information, it's the broker that has the catching up to do. I think > there might be some optimisations to make sure the broker refreshes its > metadata quickly, so it can quickly recover to handle requests that > previously returned UNKNOWN_LEADER_EPOCH. But this work is outside the > scope of this KIP, as for now this KIP focusses on client-side > optimisations. > > Mayank > > On Tue, Jul 18, 2023 at 8:51 AM Luke Chen <show...@gmail.com> wrote: > > > Hi Mayank, > > > > Thanks for the KIP! > > > > Some questions: > > 1. I can see most of the cases we only care about consumer fetch from the > > leader. > > But what if the consumer was fetching from the follower? > > We already include `PreferredReadReplica` in the fetch response. > > Should we put the node info of PreferredReadReplica under this case, > > instead of the leader's info? > > Also, under this case, should we include the leader's info in the > response? > > > > 2. Will we include the leader/node info in the response when having > > `UNKNOWN_LEADER_EPOCH` error? > > I think it's fine we ignore the `UNKNOWN_LEADER_EPOCH` error since when > > this happens, the node might have some error which should refresh the > > metadata. On the other hand, it might also be good if we can heal the > node > > soon to do produce/consume works. > > > > > > Thank you. > > Luke > > > > On Tue, Jul 18, 2023 at 2:00 AM Philip Nee <philip...@gmail.com> wrote: > > > > > Hey Mayank: > > > > > > For #1: I think fetch and produce behave a bit differently on metadata. > > > Maybe it is worth highlighting the changes for each client in detail. > In > > > producer did you mean by the metadata timeout before sending out > produce > > > requests? For consumer: I think for fetches it requires user to retry > if > > > the position does not exist on the leader. I don't have the detail on > top > > > of my head, but I think we should lay out these behavioral changes. > > > > > > For #3: Thanks for the clarification. > > > > > > On Mon, Jul 17, 2023 at 10:39 AM Mayank Shekhar Narula < > > > mayanks.nar...@gmail.com> wrote: > > > > > > > Philip > > > > > > > > 1. Good call out about "poll" behaviour, my understanding is the > same. > > I > > > am > > > > assuming it's about the motivation of the KIP. There with async, my > > > > intention was to convey that the client doesn't wait for the > > > > metadata-refresh before a subsequent retry of the produce or fetch > > > request > > > > that failed due to stale metadata(i.e. going to an old leader). The > > only > > > > wait client has is the configured retry-delay. > > > > > > > > 2. Yes, in theory other APIs could benefit from this too. But that is > > > > outside of the scope of the KIP. > > > > > > > > 3. Do you mean the response for the Metadata RPC? I think brokers > > always > > > > have a view of the cluster, although it can be stale,it would always > > > return > > > > a leader(whether old or new). > > > > > > > > Mayank > > > > > > > > On Fri, Jul 14, 2023 at 8:53 PM Philip Nee <philip...@gmail.com> > > wrote: > > > > > > > > > Hey Mayank, > > > > > > > > > > Thanks for the KIP. I think this is a great proposal, and I'm in > > favor > > > > > of this idea. A few comments: > > > > > > > > > > 1. Claiming metadata refresh is done asynchronously is misleading. > > The > > > > > metadata refresh requires Network Client to be physically polled, > > which > > > > is > > > > > done in a separate thread in Producer and Admin Client (IIRC!) but > > not > > > > > Consumer. > > > > > 2. There are other API calls that might result in > > > NOT_LEADER_OR_FOLLOWER > > > > > response, but it seems like this KIP only wants to update on fetch > > and > > > > > produce. Do we want to make the leader information available for > > other > > > > API > > > > > calls? > > > > > 3. Do you know what would happen during a leader election? I'm not > > sure > > > > > about this process and I wonder if the current metadata response > uses > > > the > > > > > old leader or null as the leader isn't readily available yet. > > > > > > > > > > Thanks, > > > > > P > > > > > > > > > > On Fri, Jul 14, 2023 at 11:30 AM Kirk True <k...@kirktrue.pro> > > wrote: > > > > > > > > > > > Hi Mayank, > > > > > > > > > > > > > On Jul 14, 2023, at 11:25 AM, Mayank Shekhar Narula < > > > > > > mayanks.nar...@gmail.com> wrote: > > > > > > > > > > > > > > Kirk > > > > > > > > > > > > > > > > > > > > >> Is the requested restructuring of the response “simply” to > > > preserve > > > > > > bytes, > > > > > > >> or is it possible that the fetch response could/should/would > > > return > > > > > > >> leadership changes for partitions that we’re specifically > > > requested? > > > > > > >> > > > > > > > > > > > > > > Moving endpoints to top-level fields would preserve bytes, > > > otherwise > > > > > the > > > > > > > endpoint-information would be duplicated if included with the > > > > > > > partition-data in the response. Right now, the top-level field > > will > > > > > only > > > > > > be > > > > > > > set in case leader changes for any requested partitions. But it > > can > > > > be > > > > > > > re-used in the future, for which Jose has a use-case in mind > > shared > > > > up > > > > > in > > > > > > > the thread. KIP is now upto date with endpoint info being at > > > > top-level. > > > > > > > > > > > > > > > > > > I didn’t catch before that there was a separate section for the > > full > > > > node > > > > > > information, not just the ID and epoch. > > > > > > > > > > > > Thanks! > > > > > > > > > > > > >>> 3. In the future, I may use this information in the > > > KRaft/Metadata > > > > > > >>> implementation of FETCH. In that implementation not all of > the > > > > > > >>> replicas are brokers. > > > > > > >> > > > > > > >> Side point: any references to the change you’re referring to? > > The > > > > idea > > > > > > of > > > > > > >> non-brokers serving as replicas is blowing my mind a bit :) > > > > > > >> > > > > > > >> > > > > > > > Jose, I missed this as well, would love to know more about > > > non-broker > > > > > > > serving as replica! > > > > > > > -- > > > > > > > Regards, > > > > > > > Mayank Shekhar Narula > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Regards, > > > > Mayank Shekhar Narula > > > > > > > > > > > > -- > Regards, > Mayank Shekhar Narula >