Following up here, I am still in need of reviews for PR [0]. It introduces an important clarification to the Pulsar Protocol Spec. Please take a look, if you are able.
[0] - https://github.com/apache/pulsar/pull/12948 Thanks! Michael On Tue, Nov 23, 2021 at 1:10 PM Michael Marshall <mikemars...@gmail.com> wrote: > > I created a PR to update the protocol's documentation: > https://github.com/apache/pulsar/pull/12948. Please take a look, if > you're able. > > Once this PR is accepted/merged, I will follow up with an update to > the Java client. > > - Michael > > On Thu, Nov 18, 2021 at 1:29 PM Michael Marshall <mikemars...@gmail.com> > wrote: > > > > I view this as an edge case of the Pulsar Protocol that requires > > clarification. Once we clarify the spec, we can update the clients to > > conform to the spec. I don't think we need to give end users control > > over this small part of the protocol. > > > > After reviewing the design a bit more, I think we should update the > > Java client to send the `CloseProducer` command, as well as update the > > spec to recommend this implementation. While the `ServerCnx` class > > "works" both ways, the current Java client implementation leads to > > unnecessary calls, unnecessary errors, and a longer time to recovery. > > > > Specifically, if the client fails to send a `CloseProducer` command, > > it ends up getting into a sequence of retries where each new > > `Producer` command receives an immediate `ErrorResponse` because the > > `ServerCnx` already has a pending producer. By sending a > > `CloseProducer` command, the client gives the broker permission to > > stop keeping track of the original create producer request. It also > > means that if the topic eventually loads, the broker will respond to > > the right request id with a `ProducerSuccessResponse` command. > > > > I will follow up with an update to the client and the protocol spec, > > unless there are any objections. > > > > Thanks, > > Michael > > > > On Thu, Nov 18, 2021 at 12:25 PM Neng Lu <nl...@apache.org> wrote: > > > > > > How about making the behavior when timeout configurable? And by default, > > > it will send the `CloseProducer` cmd? > > > > > > On 2021/11/17 05:52:21 Michael Marshall wrote: > > > > Hi All, > > > > > > > > I noticed that the `ServerCnxTest#testCreateProducerTimeout` test > > > > indicates that a producer will send a `CloserProducer` command when > > > > producer creation times out for the client. > > > > > > > > The Java client does not follow this protocol. When the producer > > > > creation times out, it just schedules another attempt to create the > > > > producer. > > > > > > > > The C++ client does follow this protocol and is annotated with the > > > > following comment: > > > > > > > > > // Creating the producer has timed out. We need to ensure the broker > > > > > closes the producer > > > > > // in case it was indeed created, otherwise it might prevent new > > > > > create producer operation, > > > > > // since we are not closing the connection > > > > > > > > I don't see anything in our official protocol spec indicating the > > > > "right" behavior. Given the test coverage, it seems like the initial > > > > design was to expect a `CloseProducer` command. However, I also see that > > > > the broker's `ServerCnx` class will reply to a redundant `Producer` > > > > command with a `ProducerSuccess` command, as long as the producer > > > > is already created. > > > > > > > > Should I submit a PR to update the Java client to send a > > > > `CloseProducer` command when a `Producer` command times out? > > > > > > > > Thanks, > > > > Michael > > > >