Hi, I updated the PIP to add more details about the recent discussion.
Thanks, Heesung On Thu, Aug 3, 2023 at 6:46 PM Heesung Sohn <heesung.s...@streamnative.io> wrote: > > > On Wed, Aug 2, 2023 at 8:02 PM PengHui Li <peng...@apache.org> wrote: > >> > One of the advantages of this proposal (add these optional dstBrokerUrl >> fields in the existing CommandCloseProducer and CommandCloseConsumer) is >> the backward compatibility >> >> Ah, yes. I missed this point. It's a good convince. >> >> > After the client connects to the destination broker, the next >> command(e.g. >> > ProducerCommand) requires >> > the destination broker to check the ownership again against its local >> table >> > view of the bundle state channel. >> >> > Upon this local ownership check, there could be the following scenarios: >> >> > Happy case: >> > - If the ownership state is `owned ` and the current broker is indeed >> the >> > owner, the command completes. >> > Unhappy cases: >> > - If the ownership state is `owned ` and the current broker is not the >> > owner, the command fails >> > (the broker returns an error to the client), and the client tries to >> find >> > the true new owner by lookups. >> > The global bundle state channel is eventually consistent, and the >> lookups >> > should eventually converge. >> > - if the ownership change is still in progress(`releasing`, >> `assigning`), >> > this check will be deferred >> > until the state becomes `owned` with a timeout. >> >> Could you please also update the explanation in the proposal? >> It looks like what is guaranteed and what is not in the new proposal. >> And why the mechanism can work with the unhappy case. >> >> Both the existing mechanism and the new mechanism follow the >> eventual consistency. The nuance is that the new mechanism will change >> the bundle owner first and then unload the topics. For the broker logs, >> will more error messages related to the fenced ledger be introduced? >> Or the client reconnects will happen after the owner cache is updated? >> >> > As described in the protocol change in the PIP, the goal is to have fewer > fenced ledger errors because the protocol change helps the client > reconnection(open ledger) > happen at the very end. > > >> If the load manager crashed before unloading the topics (close producers >> and consumers). >> Will the old broker provide service for this topic until the client >> triggers a >> new lookup? >> > > Sorry, I am not sure if I understood your question here, but > When the destination or source broker crashes in the middle, > the leader will find the orphan state and clean the ownership by selecting > a new owner, and the client will reconnect to it. > During this transfer process, if alive, the source broker will serve the > topic according to the protocol described in the PIP. > > >> Sorry, Another question about PIP-192. Is it required by the new >> mechanism? >> Will the situation also be improved without enabling PIP-192? >> We have added the new owner broker to the close producer and consumer >> command. >> We can also update the owner and then close the producer and consumers >> with >> the new owner. >> > > This PIP is based on PIP-192 because the proposed protocol change is based > on the bundle states from PIP-192. > I assume the community could raise other PIPs to utilize the close > producer and consumer command changes for other logic too. > > >> >> Regards, >> Penghui >> >> On Thu, Aug 3, 2023 at 4:38 AM Heesung Sohn >> <heesung.s...@streamnative.io.invalid> wrote: >> >> > Hi PengHui, >> > Thank you for your questions. Please find my answers inline. >> > >> > On Wed, Aug 2, 2023 at 3:47 AM PengHui Li <peng...@apache.org> wrote: >> > >> > > Hi Heesung, >> > > >> > > I'm sorry for not getting back to you sooner. >> > > The motivation and the solution look good to me. >> > > >> > > I just want to leave some comments to make the proposal clear >> > > for users and developers. >> > > >> > > >> > > >> > >> ---------------------------------------------------------------------------------------------------- >> > > >> > > We should have a section to describe the values that will be added >> > > by this proposal. >> > > >> > > - Reduced the publish latency spike and e2e latency spike during the >> > > reconnecting >> > > - Mitigated bottleneck(lookup) of a large number of topics in a >> cluster >> > > >> > > Maybe more, I can only think about the above two values. >> > > >> > > >> > Good point. I will clarify this part in the PIP. >> > >> > >> > >> ---------------------------------------------------------------------------------------------------- >> > > >> > > For the proto changes, the client side can also use the HTTP lookup >> > > service. >> > > We should also mention it in this proposal. Instead of adding four new >> > > fields to >> > > the CommandCloseProducer and CommandCloseConsumer, maybe adding a new >> > > message BrokerIdentifier, or others is better. >> > > >> > >> > I am worried about the backward compatibility if we add a new proto >> message >> > -- how can the old versions of clients handle this new proto message? >> > >> > One of the advantages of this proposal (add these optional dstBrokerUrl >> > fields in the existing CommandCloseProducer and CommandCloseConsumer) is >> > the backward compatibility -- older clients can still work in the older >> way >> > by going through the lookup requests. >> > >> > Ref: https://protobuf.dev/overview/#updating-defs >> > Updating Proto Definitions Without Updating Code >> > <https://protobuf.dev/overview/#updating-defs> >> > >> > It’s standard for software products to be backward compatible, but it is >> > less common for them to be forward compatible. As long as you follow >> > some simple >> > practices <https://protobuf.dev/programming-guides/proto3/#updating> >> when >> > updating .proto definitions, old code will read new messages without >> > issues, ignoring any newly added fields. ... >> > >> > >> > > >> > > >> > >> ---------------------------------------------------------------------------------------------------- >> > > >> > > There are two points we need to clarify in the proposal. >> > > >> > > - If the client reconnects before the new owner takes ownership. If >> the >> > > producer closed >> > > after the new broker took ownership, would it be a potential >> > consistency >> > > issue? >> > >> > - If the owner changed again before the client reconnection. How did >> Pulsar >> > > handle this case? >> > > As I understand, it's fine because the broker will check the >> ownership >> > > again when handling >> > > producer and consumer creation. If it can be explained in the >> > proposal, >> > > it should be great >> > > for developers to understand it deeply. >> > > >> > >> > After the client connects to the destination broker, the next >> command(e.g. >> > ProducerCommand) requires >> > the destination broker to check the ownership again against its local >> table >> > view of the bundle state channel. >> > >> > Upon this local ownership check, there could be the following scenarios: >> > >> > Happy case: >> > - If the ownership state is `owned ` and the current broker is indeed >> the >> > owner, the command completes. >> > Unhappy cases: >> > - If the ownership state is `owned ` and the current broker is not the >> > owner, the command fails >> > (the broker returns an error to the client), and the client tries to >> find >> > the true new owner by lookups. >> > The global bundle state channel is eventually consistent, and the >> lookups >> > should eventually converge. >> > - if the ownership change is still in progress(`releasing`, >> `assigning`), >> > this check will be deferred >> > until the state becomes `owned` with a timeout. >> > >> > >> > > Regards, >> > > Penghui >> > > >> > > On Wed, Aug 2, 2023 at 1:18 AM Heesung Sohn >> > > <heesung.s...@streamnative.io.invalid> wrote: >> > > >> > > > Hi, >> > > > >> > > > It has been over three weeks since this discussion started, and I >> > replied >> > > > to the questions in the PIP. >> > > > I will send out a vote email tomorrow if looking good. >> > > > >> > > > Thanks, >> > > > Heesung >> > > > >> > > > On Tue, Jul 25, 2023 at 1:12 PM Heesung Sohn < >> > > heesung.s...@streamnative.io >> > > > > >> > > > wrote: >> > > > >> > > > > Hi, >> > > > > >> > > > > I replied to the comments from Kai and Ran in the PIP. >> > > > > >> > > > > Please review the PIP by any chance. >> > > > > >> > > > > Thank you, >> > > > > Heesung >> > > > > >> > > > > On Thu, Jul 6, 2023 at 4:32 PM Heesung Sohn < >> > > > heesung.s...@streamnative.io> >> > > > > wrote: >> > > > > >> > > > >> Hi dev, >> > > > >> >> > > > >> I proposed this PIP, https://github.com/apache/pulsar/pull/20748 >> , >> > to >> > > > >> make unloaded clients directly and gracefully connect to new >> owner >> > > > brokers >> > > > >> without lookups. >> > > > >> >> > > > >> Please take a look and let me know what you think. >> > > > >> >> > > > >> Regards, >> > > > >> Heesung >> > > > >> >> > > > > >> > > > >> > > >> > >> >