Hi Asaf, > After reaching the end of the PIP I discovered a link to an issue that > gives an example of a misconfiguration.
I’m sorry about this, currently, I’ve updated the PIP and added a link for the `misconfigured` client. > 1. Maybe we should put this code as a general wrapper for any command, to > avoid copy-pasting it for any other command we may need in the future. I > realize it's not an easy thing to do given the way the code is structured > today. The metric name you gave is generic enough so this can be refactored > as I said but in the future. Yes, the metrics name is very general, we can reuse it in the future. Say, if we want to add some others command execution latency or failed counter metrics, we don’t need another PIP. > 2. We provide some means for the operator to discover the issue, but > definitely not easy means. Also, we're not really solving any user > experience. In theory, we can say: > * There should be a reasonable delay between requests for this metadata, > why not enforce it? Perhaps apply a rate limit to this command which seems > reasonable by default > * The client can utilize the bi-directional nature of Pulsar Protocol and > get notified when partitions change (subscribe) therefore limiting the need > to pull based on frequency. I’m sorry that I can’t give you answer about this question, this will change our current client implementation. We need another PIP for this Thanks, Tao Jiuming > 2022年11月8日 下午10:56,Asaf Mesika <asaf.mes...@gmail.com> 写道: > > Hi Jiuming, > > I have a few questions: > > Motivation >> >> Currently, there's no way to track CommandPartitionedTopicMetadata >> requests. There's no metrics that indicate that a broker is handling >> CommandPartitionedTopicMetadata requests. >> >> Misconfigured clients might flood brokers with >> CommandPartitionedTopicMetadata requests and cause high CPU consumption. >> > I managed to understand the first line: You can't know how many such > requests each broker received. In the second line which is the problem > statement I couldn't figure out: > 1. What do you mean by misconfigured clients? What configuration can you > change in a wrong way that will cause the brokers with such requests? > 2. Clients ask a random broker for that metadata? > > So for now, you say that by having the metrics in place, I'll be able to > set up an alert on a spike of such requests, correlate that with the high > CPU, and then do what - reconfigure the client somehow to avoid sending > those requests? > > After reaching the end of the PIP I discovered a link to an issue that > gives an example of a misconfiguration. > Perhaps then it would be good to copy the example or at least link to it in > the motivation section to say - here is a link that contains an example (I > prefer to inline it myself). > > Other than that, I think it's good. > > I do have a thought I want to share, to hear what people think: > > When I read it a couple of things occurred to me: > 1. Maybe we should put this code as a general wrapper for any command, to > avoid copy-pasting it for any other command we may need in the future. I > realize it's not an easy thing to do given the way the code is structured > today. The metric name you gave is generic enough so this can be refactored > as I said but in the future. > > 2. We provide some means for the operator to discover the issue, but > definitely not easy means. Also, we're not really solving any user > experience. In theory, we can say: > * There should be a reasonable delay between requests for this metadata, > why not enforce it? Perhaps apply a rate limit to this command which seems > reasonable by default > * The client can utilize the bi-directional nature of Pulsar Protocol and > get notified when partitions change (subscribe) therefore limiting the need > to pull based on frequency. > > Thanks, > > Asaf > > > > On Thu, Nov 3, 2022 at 12:19 PM Jiuming Tao <jm...@streamnative.io.invalid> > wrote: > >> Hi pulsar community, >> >> I’ve opened a PIP to discuss: PIP-222: Add CommandPartitionedTopicMetadata >> metrics >> >> The PIP link: https://github.com/apache/pulsar/issues/18319 < >> https://github.com/apache/pulsar/issues/18319> >> >> Thanks, >> Tao Jiuming