I created a PR to update the protocol's documentation: https://github.com/apache/pulsar/pull/12948. Please take a look, if you're able.
Once this PR is accepted/merged, I will follow up with an update to the Java client. - Michael On Thu, Nov 18, 2021 at 1:29 PM Michael Marshall <mikemars...@gmail.com> wrote: > > I view this as an edge case of the Pulsar Protocol that requires > clarification. Once we clarify the spec, we can update the clients to > conform to the spec. I don't think we need to give end users control > over this small part of the protocol. > > After reviewing the design a bit more, I think we should update the > Java client to send the `CloseProducer` command, as well as update the > spec to recommend this implementation. While the `ServerCnx` class > "works" both ways, the current Java client implementation leads to > unnecessary calls, unnecessary errors, and a longer time to recovery. > > Specifically, if the client fails to send a `CloseProducer` command, > it ends up getting into a sequence of retries where each new > `Producer` command receives an immediate `ErrorResponse` because the > `ServerCnx` already has a pending producer. By sending a > `CloseProducer` command, the client gives the broker permission to > stop keeping track of the original create producer request. It also > means that if the topic eventually loads, the broker will respond to > the right request id with a `ProducerSuccessResponse` command. > > I will follow up with an update to the client and the protocol spec, > unless there are any objections. > > Thanks, > Michael > > On Thu, Nov 18, 2021 at 12:25 PM Neng Lu <nl...@apache.org> wrote: > > > > How about making the behavior when timeout configurable? And by default, it > > will send the `CloseProducer` cmd? > > > > On 2021/11/17 05:52:21 Michael Marshall wrote: > > > Hi All, > > > > > > I noticed that the `ServerCnxTest#testCreateProducerTimeout` test > > > indicates that a producer will send a `CloserProducer` command when > > > producer creation times out for the client. > > > > > > The Java client does not follow this protocol. When the producer > > > creation times out, it just schedules another attempt to create the > > > producer. > > > > > > The C++ client does follow this protocol and is annotated with the > > > following comment: > > > > > > > // Creating the producer has timed out. We need to ensure the broker > > > > closes the producer > > > > // in case it was indeed created, otherwise it might prevent new create > > > > producer operation, > > > > // since we are not closing the connection > > > > > > I don't see anything in our official protocol spec indicating the > > > "right" behavior. Given the test coverage, it seems like the initial > > > design was to expect a `CloseProducer` command. However, I also see that > > > the broker's `ServerCnx` class will reply to a redundant `Producer` > > > command with a `ProducerSuccess` command, as long as the producer > > > is already created. > > > > > > Should I submit a PR to update the Java client to send a > > > `CloseProducer` command when a `Producer` command times out? > > > > > > Thanks, > > > Michael > > >