Thanks Rajan, I will reply on the PR: https://github.com/apache/pulsar/pull/11627/

On Wed, Aug 11, 2021 at 10:06 AM Rajan Dhabalia <dhabalia...@gmail.com> wrote:
>
> The history behind introducing TooManyRequest error is to handle
> backpressure for zookeeper by throttling a large number of concurrent
> topics loading during broker cold restart. Therefore, pulsar has lookup
> throttling at both client and server-side that slows down lookup because
> lookup ultimately triggers topic loading at server side. So, when a client
> sees TooManyRequest errors, the client should retry to perform this
> operation and the client will eventually reconnect to the broker,
> TooManyRequest can not harm the broker because broker already has a
> safeguard to reject the flood of the requests.
>
> I am not sure what problem https://github.com/apache/pulsar/pull/6584
> PR tries to solve but it should not solve it by making TooManyRequest
> non-retriable. TooManyRequest is a retriable error and the client should
> retry. Also, it should definitely not close the producer/consumer due to
> this error otherwise it can bring down the entire application which
> depends on the availability of the pulsar client entities.
>
> Pulsar lookup is an operation similar to other operations such as:
> connect, publish, subscribe, etc. So, I don't think it needs special
> treatment with a separate timeout config and we can avoid the complexity
> introduced in PR #11627 that caches and depends on the previously seen
> exception for lookup retry.
>
> Anyways, removing TooManyRequest from the non-retriable error list will
> simplify the client behavior and we can avoid the complexity of PR: #11627
> <https://github.com/apache/pulsar/pull/11627/>
>
> Thanks,
> Rajan
>
> On Mon, Aug 9, 2021 at 12:54 AM Ivan Kelly <iv...@apache.org> wrote:
>
> > > Suppose you have about a million topics and your Pulsar cluster goes down
> > > (say, ZK down). ..many millions of producers and consumers are now
> > > anxiously awaiting the cluster to comeback. .. fun experience for the first
> > > broker that comes up. Don't ask me how I know, ref blame
> > > ServerCnx.java#L429
> > > <https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L429>.
> > > The broker limit was added to get through a cold restart.
> >
> > Ok. Makes sense. The scenarios we've been seeing issues with have had
> > modest numbers of topics.
> >
> > -Ivan
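
For reference, here is a minimal sketch of the "treat TooManyRequest as retriable" behavior discussed in the quoted thread: the caller backs off and retries instead of failing or closing the producer/consumer. This is not the Pulsar client's actual implementation. The class name, the TooManyRequestsException type defined here, and the backoff parameters are all hypothetical stand-ins chosen only for illustration.

```java
import java.util.concurrent.Callable;

public class LookupRetrySketch {

    /** Hypothetical stand-in for the error a throttled lookup would surface. */
    static class TooManyRequestsException extends Exception {
        TooManyRequestsException(String message) {
            super(message);
        }
    }

    /**
     * Runs op and retries with exponential backoff while it throws
     * TooManyRequestsException. Any other exception propagates immediately.
     * After maxAttempts throttled attempts, the last throttling error is rethrown.
     */
    static <T> T retryOnThrottle(Callable<T> op, int maxAttempts,
                                 long initialBackoffMs, long maxBackoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (TooManyRequestsException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up, but only after the retry budget is spent
                }
                Thread.sleep(backoffMs); // wait so the broker can shed load
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
            }
        }
    }
}
```

A caller would wrap the lookup in retryOnThrottle(() -> doLookup(topic), 5, 100, 2000), where doLookup is again a hypothetical placeholder. The cap on the backoff keeps a long cold restart from pushing the wait between attempts arbitrarily high.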