Hi all, I realized that sampling outlier topic/subscription/consumer/producer metrics can be useful to operate Pulsar.
I noticed that Pulsar's performance issues are mostly caused by outlier topics/subscriptions/consumers/producers. For example, disk usage can grow a lot when consumers are very slow (0 ack rate and 0 permits) while producers publish messages at a high rate to the same topic. I think sampling such outlier metrics can be useful for monitoring Pulsar.

Do we think this is a separate issue that deserves another PIP, or should it be part of this PIP-264 work?

Thanks,
Heesung

On Mon, Aug 28, 2023 at 7:48 AM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>
> I've relocated the PIP content from the issue (https://github.com/apache/pulsar/issues/20197) to a PR (https://github.com/apache/pulsar/pull/21080) so I could add a TOC and also be in line with the new process.
>
> On Mon, Aug 28, 2023 at 5:46 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>
> > Thanks for taking the time to review the document - *highly appreciated*.
> > I've inlined my comments below.
> >
> > On Mon, Aug 21, 2023 at 12:19 PM Hang Chen <chenh...@apache.org> wrote:
> >
> >> Hi Asaf,
> >> Thanks for bringing up this great proposal.
> >>
> >> After reading this proposal, I have the following questions.
> >> 1. This proposal will introduce a big breaking change in Pulsar, especially from the code perspective. I’m interested in how to support both the old and new implementations at the same time, step by step?
> >>
> >> > We will keep the current metric system as is, and add a new layer of metrics using OpenTelemetry Java SDK. All of Pulsar’s metrics will also be created using OpenTelemetry. A feature flag will allow enabling OpenTelemetry metrics (init, recording and exporting). All the features and changes described here will be done only in the OpenTelemetry layer, allowing to keep the old version working until you’re ready to switch to using the OTel (OpenTelemetry) implementation.
> >> > In the far future, once OTel usage has stabilized and become widely adopted, we’ll deprecate the current metric system and eventually remove it. We will also make sure there is a feature flag to turn off the current Prometheus-based metric system.
> >>
> > Current metrics code remains as is, untouched.
> > I'm adding new code, using the OpenTelemetry API and SDK. The code in most cases will read the existing variables (like msgsReceived), and in other cases will set up its own new objects like Counter and Histogram and *also* record values to them.
> > You can take a look at the revised PIP, as I've added a tiny code sample to give an idea of how it will look. Look here <https://github.com/apache/pulsar/blob/6ec0bde4127a54ab8e8bb67fb091c932fa2952a4/pip/pip-264.md#consolidating-to-opentelemetry>.
> >
> >> 2. We introduced group and filter logic in the metric system, and I have the following concerns.
> >> - We need to add protection logic or pre-validation for the group and filter rules, to prevent user misconfiguration from causing a huge performance impact on Pulsar brokers.
> >>
> > Good call. I've added a note in the PIP that we will reject any filter rules update if the expected number of data points exceeds a certain threshold. I left this as a detail to be specified in the sub-PIP.
> >
> >> - We need to support exposing all the topic-level metrics when the Pulsar cluster has just thousands of topics.
> >>
> > I've added a new goal: "- New system should support the maximum supported number of topics in current system (i.e. 4k topics) without filtering"
> >
> >> - Even though we introduced group and filter for the metrics, we still can’t resolve the large number of metrics exposed to Prometheus. Exposing a large amount of data (100MB+) through an HTTP endpoint is ineffective.
> >> We can consider exposing that metric data through a Pulsar topic and developing a Pulsar-to-Prometheus connector that writes Pulsar metric data to Prometheus in streaming mode instead of batch mode, to reduce the performance impact.
> >>
> > As I wrote in my PIP, if you find yourself exporting 100MB or more of metric data *every* 30 seconds, you will suffer from:
> > * High cost of the TSDB holding that data (e.g. Prometheus, Cortex, VictoriaMetrics)
> > * Query timeouts, since there is too much data to read
> >
> > Also, the bottleneck is not transfer time over the wire. It's mostly the memory needed by any TSDB to hold it for at least 2 hours before flushing it to disk - this is the most expensive of all.
> >
> > At a 100MB response size, filtering and grouping are a must.
> >
> >> - Group and filter logic uses regular expressions extensively in rules. Regular expression parsing and matching are CPU- and time-intensive operations. We have a push-down filter to reduce the number of generated metrics, but it still can’t solve the regular expression matching issues. If the user provides a complex regular expression for a group or filter rule, the metric-generating thread will be the bottleneck and will block other threads if we use synchronous calls.
> >>
> > I plan to use caching, as written in the PIP. Roughly (instrument, attributes) -> boolean. It's basically as if we are adding one boolean to the PersistentTopic class - it has so many properties that the size added is negligible.
> >
> >> - Group and filter rules are a little complex for users, and we need to provide a UI or tool to help users write correct and effective rules and show the new rules' impact on old rules.
> >>
> > We don't have a UI for now in Pulsar. We will make sure the Pulsar CLI will be convenient enough.
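The caching idea mentioned in the reply above, roughly (instrument, attributes) -> boolean, could look something like the following minimal Python sketch. It is only an illustration under stated assumptions: the class `CachedMetricFilter`, its methods, the rule format (regexes matched against a rendered `instrument{key=value,...}` string) and the metric and topic names are all hypothetical and are not taken from the PIP or from Pulsar's code.

```python
import re
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical cache key: instrument name plus an immutable view of its attributes.
CacheKey = Tuple[str, FrozenSet[Tuple[str, str]]]


class CachedMetricFilter:
    """Hypothetical filter whose drop rules are regular expressions."""

    def __init__(self, drop_patterns: List[str]) -> None:
        # Compile each rule once, not on every recording.
        self._drop = [re.compile(p) for p in drop_patterns]
        self._cache: Dict[CacheKey, bool] = {}
        # Counts slow-path (regex) evaluations, for illustration only.
        self.regex_evaluations = 0

    def is_enabled(self, instrument: str, attributes: Dict[str, str]) -> bool:
        key = (instrument, frozenset(attributes.items()))
        cached = self._cache.get(key)
        if cached is not None:
            return cached  # fast path: a single dictionary lookup
        self.regex_evaluations += 1
        rendered = instrument + "{" + ",".join(
            f"{k}={v}" for k, v in sorted(attributes.items())
        ) + "}"
        enabled = not any(p.search(rendered) for p in self._drop)
        self._cache[key] = enabled
        return enabled

    def update_rules(self, drop_patterns: List[str]) -> None:
        # A rule change invalidates every cached decision.
        self._drop = [re.compile(p) for p in drop_patterns]
        self._cache.clear()


metric_filter = CachedMetricFilter([r"topic=persistent://tenant/ns/noisy-.*"])
noisy = {"topic": "persistent://tenant/ns/noisy-1"}
metric_filter.is_enabled("pulsar_msg_in", noisy)  # evaluates the regex once
metric_filter.is_enabled("pulsar_msg_in", noisy)  # answered from the cache
```

With this shape, the regular expressions run once per (instrument, attributes) pair and again only after a rule update, so the recording path normally costs one lookup.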
> >>
> >> Thanks,
> >> Hang
> >>
> >> On Thu, Jun 15, 2023 at 23:14, Matteo Merli <matteo.me...@gmail.com> wrote:
> >> >
> >> > > Proposing a large breaking change (even if it's crucial) is the single fastest way to motivate your users to migrate to a different platform. I wish it wasn't the case, but it's the cold reality.
> >> >
> >> > If you read the proposal, there is no real breaking change. There will be a switch to choose the existing metrics or the new ones. The dashboards will be updated and provided.
> >> >
> >> > At the same time, the surest way to motivate users to switch or not adopt a platform is to stick with confusing/inconsistent APIs/metrics.
> >> >
> >> > --
> >> > Matteo Merli
> >> > <matteo.me...@gmail.com>
> >> >
> >> > On Wed, Jun 14, 2023 at 6:10 PM Devin Bost <devin.b...@gmail.com> wrote:
> >> >
> >> > > > Thanks for the details, Devin. Curious - 'We' stands for which company?
> >> > >
> >> > > What do you mean? I was quoting Rajan when I said, "we."
> >> > >
> >> > > Devin G. Bost
> >> > >
> >> > > On Wed, Jun 14, 2023 at 10:02 AM Asaf Mesika <asaf.mes...@gmail.com> wrote:
> >> > >
> >> > > > Thanks for the details, Devin. Curious - 'We' stands for which company?
> >> > > >
> >> > > > Can you take a look at my previous response to see if it answers the concern you raised?
> >> > > >
> >> > > > Thanks!
> >> > > >
> >> > > > On Wed, Jun 14, 2023 at 1:49 PM Devin Bost <devin.b...@gmail.com> wrote:
> >> > > >
> >> > > > > > Hi,
> >> > > > > >
> >> > > > > > Are we proposing a change to break existing metrics compatibility (prometheus)?
> >> > > > > > If that is the case then it's a big red flag as it will be a pain for any company to upgrade Pulsar as monitoring is THE most important part of the system and we don't even want to break compatibility for any small things to avoid interruption for users that are using Pulsar system.
> >> > > > > > I think it's always good to enhance a system by maintaining compatibility and I would be fine if we can introduce new metrics API without causing ANY interruption to existing metrics API. But if we can't maintain compatibility then it's a big red flag and not acceptable for the Pulsar community.
> >> > > > >
> >> > > > > Proposing a large breaking change (even if it's crucial) is the single fastest way to motivate your users to migrate to a different platform. I wish it wasn't the case, but it's the cold reality.
> >> > > > >
> >> > > > > With that said, I'm a big proponent of Open Telemetry. I did a big video a while back that some of you may remember on the use of Open Tracing (before it was merged into Open Telemetry). Open Telemetry has gained considerable momentum in the industry since then.
> >> > > > >
> >> > > > > I'm also very interested in a solution to the metrics problem. I've run into the scalability issues with metrics in production, and I've been very concerned about the metrics bottlenecks around our ability to deliver our promises around supporting large numbers of topics.
> >> > > > > One of the big advantages of Pulsar over Kafka is supposed to be that topics are cheap, but as it stands, our current metrics design gets seriously in the way of that. Generally speaking, I'm open to solutions, especially if they align us with a growing industry standard.
> >> > > > >
> >> > > > > - Devin
> >> > > > >
> >> > > > > On Wed, Jun 14, 2023, 3:28 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> >> > > > >
> >> > > > > > On Wed, Jun 14, 2023, 04:33, Rajan Dhabalia <rdhaba...@apache.org> wrote:
> >> > > > > >
> >> > > > > > > Hi,
> >> > > > > > >
> >> > > > > > > Are we proposing a change to break existing metrics compatibility (prometheus)? If that is the case then it's a big red flag as it will be a pain for any company to upgrade Pulsar as monitoring is THE most important part of the system and we don't even want to break compatibility for any small things to avoid interruption for users that are using Pulsar system.
> >> > > > > > > I think it's always good to enhance a system by maintaining compatibility and I would be fine if we can introduce new metrics API without causing ANY interruption to existing metrics API. But if we can't maintain compatibility then it's a big red flag and not acceptable for the Pulsar community.
> >> > > > > >
> >> > > > > > I agree.
> >> > > > > >
> >> > > > > > If it is possible to export data in a way that is compatible with Prometheus without adding too much overhead, then I would support this work.
> >> > > > > >
> >> > > > > > About renaming the metrics: we can do it only if the changes for users are as trivial as replacing the queries in the Grafana dashboards or in alerting systems.
> >> > > > > >
> >> > > > > > Asaf, do you have a prototype? Built over any version of Pulsar?
> >> > > > > >
> >> > > > > > Also, it would be very useful to start an initiative to collect the list of metrics that people really use in production, especially for automated alerts.
> >> > > > > >
> >> > > > > > In my experience you usually care about:
> >> > > > > > - in/out traffic (rates, bytes...)
> >> > > > > > - number of producers, consumers, topics, subscriptions...
> >> > > > > > - backlog
> >> > > > > > - JVM metrics
> >> > > > > > - function custom metrics
> >> > > > > >
> >> > > > > > Enrico
> >> > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > Rajan
> >> > > > > > >
> >> > > > > > > On Sun, May 21, 2023 at 9:01 AM Asaf Mesika <asaf.mes...@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > > Thanks for the reply, Enrico.
> >> > > > > > > > Completely agree.
> >> > > > > > > > This made me realize my TL;DR wasn't talking about export.
> >> > > > > > > > I added this to it:
> >> > > > > > > >
> >> > > > > > > > ---
> >> > > > > > > > Pulsar OTel Metrics will support exporting as a Prometheus HTTP endpoint (`/metrics` but on a different port) for backward compatibility, and also OTLP, so you can push the metrics to an OTel Collector and from there ship them to any destination.
> >> > > > > > > > ---
> >> > > > > > > >
> >> > > > > > > > OTel supports two kinds of exporter: Prometheus (HTTP) and OTLP (push).
> >> > > > > > > > We'll just configure Pulsar to use them.
> >> > > > > > > >
> >> > > > > > > > On Mon, May 15, 2023 at 10:35 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> >> > > > > > > >
> >> > > > > > > > > Asaf,
> >> > > > > > > > > thanks for contributing in this area.
> >> > > > > > > > > Metrics are a fundamental feature of Pulsar.
> >> > > > > > > > >
> >> > > > > > > > > Currently I find it very awkward to maintain metrics, and I also see it as a problem to support only Prometheus.
> >> > > > > > > > >
> >> > > > > > > > > Regarding your proposal, IIRC in the past someone else proposed supporting other metrics systems, and they were advised to use a sidecar approach, that is, to add something next to the Pulsar services that serves the metrics in the preferred format/way.
> >> > > > > > > > > I find that the sidecar approach is too inefficient and I am not proposing it (but I wanted to add this reference for the benefit of new people on the list).
> >> > > > > > > > >
> >> > > > > > > > > I wonder if it would be possible to keep compatibility with the current Prometheus-based metrics.
> >> > > > > > > > > Pulsar has now reached a point where it is widely used by many companies, also with big clusters. Telling people that they have to rework all the infrastructure related to metrics, because we don't support Prometheus anymore or because we radically changed the way we publish metrics, is a step that seems too hard from my point of view.
> >> > > > > > > > >
> >> > > > > > > > > Currently I believe that compatibility is more important than versatility, and if we want to introduce new (and far better) features we must take it into account.
> >> > > > > > > > >
> >> > > > > > > > > So my point is that I generally support the idea of opening the way to Open Telemetry, but we must have a way to not force all of our users to throw away their alerting systems, dashboards and know-how in troubleshooting Pulsar problems in production and dev.
> >> > > > > > > > >
> >> > > > > > > > > Best regards
> >> > > > > > > > > Enrico
> >> > > > > > > > >
> >> > > > > > > > > On Mon, May 15, 2023 at 02:17, Dave Fisher <wave4d...@comcast.net> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > On May 10, 2023, at 1:01 AM, Asaf Mesika <asaf.mes...@gmail.com> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Tue, May 9, 2023 at 11:29 PM Dave Fisher <w...@apache.org> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > >>> On May 8, 2023, at 2:49 AM, Asaf Mesika <asaf.mes...@gmail.com> wrote:
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Your feedback made me realize I need to add a "TL;DR" section, which I just added.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> I'm quoting it here. It gives a brief summary of the proposal, which requires up to 5 min of read time, helping you get a high-level picture before you dive into the background/motivation/solution.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> ----------------------
> >> > > > > > > > > > >>> TL;DR
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Working with metrics today as a user or a developer is hard and has many severe issues.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> From the user perspective:
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> - One of Pulsar's strongest features is "cheap" topics, so you can easily have 10k - 100k topics per broker. Once you do that, you quickly learn that the amount of metrics you export via "/metrics" (Prometheus-style endpoint) becomes really big. The cost to store them becomes too high, queries time out, or even the "/metrics" endpoint itself times out.
> >> > > > > > > > > > >>>   The only option Pulsar gives you today is all-or-nothing filtering and very crude aggregation.
> >> > > > > > > > > > >>>   You switch metrics from topic aggregation level to namespace aggregation level. You can also turn off producer- and consumer-level metrics. You end up doing it all, leaving you "blind", looking at the metrics from a namespace level, which is too high-level. You end up conjuring all kinds of scripts on top of the topic stats endpoint to glue together some aggregated metrics view for the topics you need.
> >> > > > > > > > > > >>> - Summaries (a metric type giving you quantiles like p95), which are used in Pulsar, can't be aggregated across topics / brokers due to their inherent design.
> >> > > > > > > > > > >>> - Plugin authors spend too much time on defining and exposing metrics to Pulsar, since the only interface Pulsar offers is writing your metrics yourself as UTF-8 bytes in Prometheus Text Format to a byte stream interface given to you.
> >> > > > > > > > > > >>> - Pulsar histograms are exported in a way that is not conformant with Prometheus, which means you can't get the p95 quantile on such histograms, making them very hard to use in day-to-day life.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> What version of DataSketches is used to produce the histogram? Is it still an old Yahoo one, or are we using an updated one from Apache DataSketches?
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> Seems like this is a single PR/small PIP for 3.1?
> >> > > > > > > > > > >
> >> > > > > > > > > > > Histograms are a list of buckets, each of which is a counter.
> >> > > > > > > > > > > A Summary is a collection of values collected over a time window, for which at the end you get a calculation of the quantiles of those values: p95, p50, and those are exported from Pulsar.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Pulsar histograms do not use DataSketches.
> >> > > > > > > > > >
> >> > > > > > > > > > Bookkeeper Metrics wraps Yahoo DataSketches last I checked.
> >> > > > > > > > > >
> >> > > > > > > > > > > They are just counters.
> >> > > > > > > > > > > They do not adhere to Prometheus since:
> >> > > > > > > > > > > a. The counter is expected to be cumulative, but Pulsar resets each bucket counter to 0 every 1 min
> >> > > > > > > > > > > b. The bucket upper range is expected to be written as an attribute "le", but today it is encoded in the name of the metric itself.
> >> > > > > > > > > > >
> >> > > > > > > > > > > This is a breaking change, hence hard to make in any small release.
> >> > > > > > > > > > > This is why it's part of this PIP, since so many things will break, and all of them will break on a separate layer (OTel metrics), hence not breaking anyone without their consent.
> >> > > > > > > > > >
> >> > > > > > > > > > If this change will break existing Grafana dashboards and other operational monitoring already in place then it will break guarantees we have made about safely being able to downgrade from a bad upgrade.
> >> > > > > > > > > >
> >> > > > > > > > > > >>
> >> > > > > > > > > > >>> - Too many metrics are rates, which are also delta-reset at every interval you configure in Pulsar and on restart, instead of relying on cumulative (ever-growing) counters and letting Prometheus use its rate function.
> >> > > > > > > > > > >>> - and many more issues
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> From the developer perspective:
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> - There are 4 different ways to define and record metrics in Pulsar: Pulsar's own metrics library, Prometheus Java Client, Bookkeeper metrics library and plain native Java SDK objects (AtomicLong, ...). It's very confusing for the developer and creates inconsistencies for the end user (e.g. Summary is different in each).
> >> > > > > > > > > > >>> - Patching your metrics into the "/metrics" Prometheus endpoint is confusing, cumbersome and error-prone.
> >> > > > > > > > > > >>> - many more
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> This proposal offers several key changes to solve that:
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> - Cardinality (supporting 10k-100k topics per broker) is solved by introducing a new aggregation level for metrics called Topic Metric Group. Using configuration, you specify for each topic its group (using wildcard/regex).
> >> > > > > > > > > > >>>   This allows you to "zoom out" to a more detailed granularity level, groups instead of namespaces, where you control how many groups you'll have, hence solving the cardinality issue without sacrificing too much level of detail.
> >> > > > > > > > > > >>> - Fine-grained, dynamic filtering mechanism. You'll have rule-based dynamic configuration, allowing you to specify per namespace/topic/group which metrics you'd like to keep/drop. Rules allow you to set the default to have a small amount of metrics at group and namespace level only and drop the rest. When needed, you can add an override rule to "open" up a certain group to have more metrics in higher granularity (topic or even consumer/producer level). Since it's dynamic, you "open" such a group when you see it's misbehaving, see it at topic level, and when all is resolved, you can "close" it. It's a bit similar to the experience with logging levels in Log4j or Logback, where you set a default and override per class/package.
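As a rough illustration of the Topic Metric Group and rule-based filtering ideas in the TL;DR above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `GroupRule` and `DetailRule` classes, the "group"/"topic" detail levels, the topic names and the counter values are hypothetical and do not reflect the PIP's actual configuration format.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class GroupRule:
    pattern: str  # regex over the full topic name
    group: str    # Topic Metric Group assigned on a match


@dataclass
class DetailRule:
    group_pattern: str  # regex over the group name
    level: str          # "group" (aggregated) or "topic" (opened up)


def assign_group(topic: str, rules: List[GroupRule], default: str = "other") -> str:
    # The first matching rule wins; unmatched topics land in a catch-all group.
    for rule in rules:
        if re.fullmatch(rule.pattern, topic):
            return rule.group
    return default


def detail_level(group: str, rules: List[DetailRule], default: str = "group") -> str:
    # Later rules override earlier ones, similar to logger-level overrides.
    level = default
    for rule in rules:
        if re.fullmatch(rule.group_pattern, group):
            level = rule.level
    return level


def aggregate(
    msg_in_by_topic: Dict[str, int],
    group_rules: List[GroupRule],
    detail_rules: List[DetailRule],
) -> Dict[Tuple[str, Optional[str]], int]:
    # Keys are (group, topic); topic is None when reported at group level only.
    out: Dict[Tuple[str, Optional[str]], int] = defaultdict(int)
    for topic, count in msg_in_by_topic.items():
        group = assign_group(topic, group_rules)
        if detail_level(group, detail_rules) == "topic":
            out[(group, topic)] += count
        else:
            out[(group, None)] += count
    return dict(out)


group_rules = [GroupRule(r".*/orders-.*", "orders"), GroupRule(r".*/logs-.*", "logs")]
counts = {"t/ns/orders-1": 5, "t/ns/orders-2": 7, "t/ns/logs-1": 100, "t/ns/misc": 1}

closed = aggregate(counts, group_rules, [])                              # one series per group
opened = aggregate(counts, group_rules, [DetailRule("orders", "topic")]) # "orders" per topic
```

With no override, the number of exported series is bounded by the number of groups. Adding a single `DetailRule` "opens" one group to topic level while the rest stay aggregated, and removing it "closes" the group again.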
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Aggregation and filtering combined solve the cardinality issue without sacrificing the level of detail when needed and, most importantly, you determine which topic/group/namespace it happens on.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Since this change is so invasive, it requires a single metrics library to implement all of it on top of. Hence the third big change is consolidating all four ways to define and record metrics into a single, new one: OpenTelemetry Metrics (Java SDK, and also Python and Go for the Pulsar Function runners).
> >> > > > > > > > > > >>> Introducing OpenTelemetry (OTel) also solves the biggest pain point from the developer perspective, since it's a superb metrics library offering everything you need, and there is going to be a single way - only it. It also solves the robustness problem for plugin authors, who will use OpenTelemetry. It so happens that it also solves all the numerous problems described in the doc itself.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> The solution will be introduced as another layer with feature toggles, so you can work with the existing system and/or OTel, until the existing system is gradually deprecated.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> It's a big breaking change for Pulsar users on many fronts: names, semantics, configuration. Read the end of this doc to learn exactly what will change for the user (at a high level).
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> In my opinion, it will make the Pulsar user experience so much better that they will want to migrate to it, despite the breaking change.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> This was a very short summary. You are most welcome to read the full design document below and express feedback, so we can make it better.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> On Sun, May 7, 2023 at 7:52 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>>> On Sun, May 7, 2023 at 4:23 PM Yunze Xu <y...@streamnative.io.invalid> wrote:
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>>> I was excited to learn much more about metrics when I started reading this proposal.
> >> > > > > > > > > > >>>>> But I became more and more frustrated when I found there was still too much content left even though I had already spent much time reading this proposal. I'm wondering how much time you expected reviewers to spend reading through this proposal? I just recalled the discussion you started before [1]. Did you expect each PMC member that gives his/her +1 to read only parts of this proposal?
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>> I estimated around 2 hours needed for a reviewer.
> >> > > > > > > > > > >>>> I hate it being so long, but I simply couldn't find a way to downsize it more. Furthermore, I consulted with my colleagues, including Matteo, but we couldn't see a way to scope it down.
> >> > > > > > > > > > >>>> Why? Because once you begin this journey, you need to know how it's going to end.
> >> > > > > > > > > > >>>> What I ended up doing is writing all the crucial details for review in the High Level Design section.
> >> > > > > > > > > > >>>> It's still a big, hefty section, but I don't think I can step out or let anyone else change Pulsar so invasively without the full extent of the change.
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>> I don't think it's wise to read parts.
> >> > > > > > > > > > >>>> I made my very best effort to minimize it, but the scope is simply big.
> >> > > > > > > > > > >>>> Open for suggestions, but it requires reading the whole PIP :)
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>> Thanks a lot Yunze for dedicating any time to it.
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> Let's get back to the proposal. For now, what I mainly learned and am mostly concerned about is:
> >> > > > > > > > > > >>>>> 1. Pulsar has many ways to expose metrics. It's not unified and it's confusing.
> >> > > > > > > > > > >>>>> 2. The current metrics system cannot support a large number of topics.
> >> > > > > > > > > > >>>>> 3. It's hard for plugin authors to integrate metrics. (For example, KoP [2] integrates metrics by implementing the PrometheusRawMetricsProvider interface and it indeed needs much work)
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> Regarding the 1st issue, this proposal chooses OpenTelemetry (OTel).
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> Regarding the 2nd issue, I scrolled to the "Why OpenTelemetry?" section. It's still frustrating to see no answer. Eventually, I found
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>> OpenTelemetry isn't the solution for a large number of topics.
> >> > > > > > > > > > >>>> The solution is described in the "Aggregate and Filtering to solve cardinality issues" section.
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>>> the explanation in the "What we need to fix in OpenTelemetry - Performance" section. It seems that we still need some enhancements in OTel. In other words, currently OTel is not ready to resolve all the issues listed in the proposal, but we believe it will be.
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>> Let me rephrase "believe" --> we are working together with the maintainers to do it, yes.
> >> > > > > > > > > > >>>> I am open to any other suggestions.
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> As for the 3rd issue, from the "Integrating with Pulsar Plugins" section, the plugin authors still need to implement the new OTel interfaces.
> >> > > > > > > > > > >>>>> Is it much easier than using the existing ways to expose metrics? Could metrics still be easily integrated with Grafana?
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>> Yes, it's way easier.
> >> > > > > > > > > > >>>> Basically you have full-fledged metrics library objects: Meter, Gauge, Histogram, Counter.
> >> > > > > > > > > > >>>> No more Raw Metrics Provider, writing UTF-8 bytes in Prometheus format.
> >> > > > > > > > > > >>>> You get namespacing for free with Meter name and version.
> >> > > > > > > > > > >>>> It's way better than the current solution and any other library.
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> That's all I am concerned about at the moment. I understand and appreciate that you've spent much time studying and explaining all these things. But this proposal is still too huge.
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>> I appreciate your effort a lot!
> >> > > > > > > > > > >>>>
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> [1] https://lists.apache.org/thread/04jxqskcwwzdyfghkv4zstxxmzn154kf
> >> > > > > > > > > > >>>>> [2] https://github.com/streamnative/kop/blob/master/kafka-impl/src/main/java/io/streamnative/pulsar/handlers/kop/stats/PrometheusMetricsProvider.java
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> Thanks,
> >> > > > > > > > > > >>>>> Yunze
> >> > > > > > > > > > >>>>>
> >> > > > > > > > > > >>>>> On Sun, May 7, 2023 at 5:53 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
> >> > > > > > > > > > >>>>>>
> >> > > > > > > > > > >>>>>> I'm very appreciative of feedback from multiple Pulsar users and devs on this PIP, since it suggests dramatic changes and quite an extensive positive change for the users.
> >> > > > > > > > > > >>>>>>
> >> > > > > > > > > > >>>>>> On Thu, Apr 27, 2023 at 7:32 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
> >> > > > > > > > > > >>>>>>
> >> > > > > > > > > > >>>>>>> Hi all,
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> I'm very excited to release a PIP I've been working on for the past 11 months, which I think will be immensely valuable to Pulsar, which I like so much.
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> PIP: https://github.com/apache/pulsar/issues/20197
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> I'm quoting here the preface:
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> === QUOTE START ===
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> Roughly 11 months ago, I started working on solving the biggest issue with Pulsar metrics: the lack of ability to monitor a pulsar broker with a large topic count: 10k, 100k, and future support of 1M. This started by mapping the existing functionality and then enumerating all the problems I saw (all documented in this doc <https://docs.google.com/document/d/1vke4w1nt7EEgOvEerPEUS-Al3aqLTm9cl2wTBkKNXUA/edit?usp=sharing>
> >> > > > > > > > > >
> >> > > > > > > > > > I thought we were going to stop using Google docs for PIPs.
> >> > > > > > > > > >
> >> > > > > > > > > > >>>>>>> ).
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> This PIP is a parent PIP.
> >> > > > > > > > > > >>>>>>> It aims to gradually solve (using sub-PIPs) all the current metric system's problems and provide the ability to monitor a broker with a large topic count, which is currently lacking. As a parent PIP, it will describe each problem and its solution at a high level, leaving fine-grained details to the sub-PIPs. The parent PIP ensures all solutions align and do not contradict each other.
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> The basic building block for monitoring a large topic count is aggregating internally (to topic groups) and adding fine-grained filtering. We could have shoe-horned it into the existing metric system, but we thought adding that to a system already ingrained with many problems would be wrong and hard to do gradually, as so many things will break.
> >> > > > > > > > > > >>>>>>> This is why the second-biggest design decision presented here is consolidating all existing metric libraries into a single one - OpenTelemetry <https://opentelemetry.io/>. The parent PIP will explain why OpenTelemetry was chosen out of existing solutions and why it far exceeds all other options. I’ve been working closely with the OpenTelemetry community in the past eight months: brain-storming this integration, and raising issues, in an effort to remove serious blockers to make this migration successful.
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> I made every effort to summarize this document so that it can be concise yet clear. I understand it is an effort to read it and, more so, provide meaningful feedback on such a large document; hence I’m very grateful for each individual who does so.
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> I think this design will help improve the user experience immensely, so it is worth the time spent reading it.
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> === QUOTE END ===
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> Thanks!
> >> > > > > > > > > > >>>>>>>
> >> > > > > > > > > > >>>>>>> Asaf Mesika
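
For readers following the thread, the "aggregating internally (to topic groups) and adding fine-grained filtering" idea from the quoted preface can be pictured roughly as below. This is only a minimal sketch in plain Java, not PIP-264's design and not the OpenTelemetry API. The class TopicGroupAggregation, the methods aggregate and namespaceOf, the grouping rule (everything before the last "/" in the topic name) and the sample topic names are all hypothetical, chosen just for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration only: not Pulsar or OpenTelemetry code.
class TopicGroupAggregation {

    // Sums per-topic counter values into per-group totals, skipping any
    // group rejected by the filter. The grouping and filtering rules are
    // supplied by the caller.
    static Map<String, Long> aggregate(Map<String, Long> perTopic,
                                       Function<String, String> groupOf,
                                       Predicate<String> keepGroup) {
        Map<String, Long> perGroup = new TreeMap<>();
        for (Map.Entry<String, Long> entry : perTopic.entrySet()) {
            String group = groupOf.apply(entry.getKey());
            if (keepGroup.test(group)) {
                perGroup.merge(group, entry.getValue(), Long::sum);
            }
        }
        return perGroup;
    }

    // Assumed grouping rule: drop the last path segment of the topic name,
    // so "persistent://tenant/ns/topic" maps to "persistent://tenant/ns".
    static String namespaceOf(String topic) {
        int lastSlash = topic.lastIndexOf('/');
        return lastSlash < 0 ? topic : topic.substring(0, lastSlash);
    }

    public static void main(String[] args) {
        // Made-up per-topic message counts.
        Map<String, Long> msgsIn = new TreeMap<>();
        msgsIn.put("persistent://acme/orders/created", 120L);
        msgsIn.put("persistent://acme/orders/cancelled", 30L);
        msgsIn.put("persistent://other/logs/raw", 7L);

        // Export one series per namespace, and only for the "acme" tenant.
        Map<String, Long> exported = aggregate(
                msgsIn,
                TopicGroupAggregation::namespaceOf,
                group -> group.startsWith("persistent://acme/"));
        System.out.println(exported);
    }
}
```

The point of the sketch is that the number of exported series depends on the number of groups that pass the filter rather than on the number of topics, which is what keeps cardinality bounded as the topic count grows.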