On Wed, Jun 14, 2023 at 11:28 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> On Wed, Jun 14, 2023, 04:33 Rajan Dhabalia <rdhaba...@apache.org> wrote:
>
>> Hi,
>>
>> Are we proposing a change to break existing metrics compatibility (prometheus)? If that is the case then it's a big red flag as it will be a pain for any company to upgrade Pulsar as monitoring is THE most important part of the system and we don't even want to break compatibility for any small things to avoid interruption for users that are using Pulsar system.
>> I think it's always good to enhance a system by maintaining compatibility and I would be fine if we can introduce new metrics API without causing ANY interruption to existing metrics API. But if we can't maintain compatibility then it's a big red flag and not acceptable for the Pulsar community.
>
> I agree.
>
> If it is possible to export data in a way that is compatible with Prometheus without adding too much overhead then I would support this work.
>
> About renaming the metrics: we can do it only if the changes for users are as trivial as replacing the queries in the grafana dashboard or in alerting systems.

Can you look at the answer I gave in the prior response and see if it answers your comments?

> Asaf, do you have prototype? Built over any version of Pulsar?

No. The idea of the parent PIP is to agree on the solution's direction before I start. We're talking about a year of work.

> Also, it would be very useful to start an initiative to collect the list of metrics that people really use in production, especially for automated alerts.
>
> In my experience you usually care about:
> - in/out traffic (rates, bytes...)
> - number of producer, consumers, topics, subscriptions...
> - backlog
> - jvm metrics
> - function custom metrics

I am trying to understand. Do you mean I can use it as default exposed metrics since introducing the new filter mechanism?
> Enrico
>
>> Thanks,
>> Rajan
>>
>> On Sun, May 21, 2023 at 9:01 AM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>>
>>> Thanks for the reply, Enrico.
>>> Completely agree.
>>> This made me realize my TL;DR wasn't talking about export.
>>> I added this to it:
>>>
>>> ---
>>> Pulsar OTel Metrics will support exporting as Prometheus HTTP endpoint (`/metrics` but different port) for backward compatibility and also OTLP, so you can push the metrics to OTel Collector and from there ship it to any destination.
>>> ---
>>>
>>> OTel supports two kinds of exporter: Prometheus (HTTP) and OTLP (push).
>>> We'll just configure to use them.
>>>
>>> On Mon, May 15, 2023 at 10:35 AM Enrico Olivelli <eolive...@gmail.com> wrote:
>>>
>>>> Asaf,
>>>> thanks for contributing in this area.
>>>> Metrics are a fundamental feature of Pulsar.
>>>>
>>>> Currently I find it very awkward to maintain metrics, and also I see it as a problem to support only Prometheus.
>>>>
>>>> Regarding your proposal, IIRC in the past someone else proposed to support other metrics systems and they have been suggested to use a sidecar approach, that is to add something next to Pulsar services that served the metrics in the preferred format/way.
>>>> I find that the sidecar approach is too inefficient and I am not proposing it (but I wanted to add this reference for the benefit of new people on the list).
>>>>
>>>> I wonder if it would be possible to keep compatibility with the current Prometheus based metrics.
>>>> Now Pulsar reached a point at which it is widely used by many companies and also with big clusters. Telling people that they have to rework all the infrastructure related to metrics because we don't support Prometheus anymore or because we changed radically the way we publish metrics is a step that seems too hard from my point of view.
>>>>
>>>> Currently I believe that compatibility is more important than versatility, and if we want to introduce new (and far better) features we must take it into account.
>>>>
>>>> So my point is that I generally support the idea of opening the way to Open Telemetry, but we must have a way to not force all of our users to throw away their alerting systems, dashboards and know-how in troubleshooting Pulsar problems in production and dev
>>>>
>>>> Best regards
>>>> Enrico
>>>>
>>>> On Mon, May 15, 2023 at 02:17 Dave Fisher <wave4d...@comcast.net> wrote:
>>>>
>>>>>> On May 10, 2023, at 1:01 AM, Asaf Mesika <asaf.mes...@gmail.com> wrote:
>>>>>>
>>>>>> On Tue, May 9, 2023 at 11:29 PM Dave Fisher <w...@apache.org> wrote:
>>>>>>
>>>>>>>>> On May 8, 2023, at 2:49 AM, Asaf Mesika <asaf.mes...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Your feedback made me realize I need to add a "TL;DR" section, which I just added.
>>>>>>>>
>>>>>>>> I'm quoting it here. It gives a brief summary of the proposal, which requires up to 5 min of read time, helping you get a high level picture before you dive into the background/motivation/solution.
>>>>>>>>
>>>>>>>> ----------------------
>>>>>>>> TL;DR
>>>>>>>>
>>>>>>>> Working with Metrics today as a user or a developer is hard and has many severe issues.
>>>>>>>>
>>>>>>>> From the user perspective:
>>>>>>>>
>>>>>>>> - One of Pulsar's strongest features is "cheap" topics so you can easily have 10k - 100k topics per broker. Once you do that, you quickly learn that the amount of metrics you export via "/metrics" (Prometheus style endpoint) becomes really big. The cost to store them becomes too high, queries time-out or even "/metrics" endpoint itself times out.
>>>>>>>>   The only option Pulsar gives you today is all-or-nothing filtering and very crude aggregation. You switch metrics from topic aggregation level to namespace aggregation level. Also you can turn off producer and consumer level metrics. You end up doing it all leaving you "blind", looking at the metrics from a namespace level which is too high level. You end up conjuring all kinds of scripts on top of topic stats endpoint to glue some aggregated metrics view for the topics you need.
>>>>>>>> - Summaries (metric type giving you quantiles like p95) which are used in Pulsar, can't be aggregated across topics / brokers due to their inherent design.
>>>>>>>> - Plugin authors spend too much time on defining and exposing metrics to Pulsar since the only interface Pulsar offers is writing your metrics by yourself as UTF-8 bytes in Prometheus Text Format to byte stream interface given to you.
>>>>>>>> - Pulsar histograms are exported in a way that is not conformant with Prometheus, which means you can't get the p95 quantile on such histograms, making them very hard to use in day to day life.
>>>>>>>
>>>>>>> What version of DataSketches is used to produce the histogram? Is it still an old Yahoo one, or are we using an updated one from Apache DataSketches?
>>>>>>>
>>>>>>> Seems like this is a single PR/small PIP for 3.1?
>>>>>>
>>>>>> Histograms are a list of buckets, each is a counter.
>>>>>> Summary is a collection of values collected over a time window, which at the end you get a calculation of the quantiles of those values: p95, p50, and those are exported from Pulsar.
>>>>>>
>>>>>> Pulsar histograms do not use Data Sketches.
>>>>>
>>>>> Bookkeeper Metrics wraps Yahoo DataSketches last I checked.
>>>>>
>>>>>> They are just counters.
>>>>>> They do not adhere to Prometheus since:
>>>>>> a. The counter is expected to be cumulative, but Pulsar resets each bucket counter to 0 every 1 min
>>>>>> b. The bucket upper range is expected to be written as an attribute "le" but today it is encoded in the name of the metric itself.
>>>>>>
>>>>>> This is a breaking change, hence hard to mark in any small release.
>>>>>> This is why it's part of this PIP since so many things will break, and all of them will break on a separate layer (OTel metrics), hence not break anyone without their consent.
>>>>>
>>>>> If this change will break existing Grafana dashboards and other operational monitoring already in place then it will break guarantees we have made about safely being able to downgrade from a bad upgrade.
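
For readers less familiar with the two histogram points raised in the quoted reply (cumulative bucket counters, and the bucket upper bound carried in an "le" label rather than in the metric name), here is a minimal sketch of what a Prometheus-style histogram looks like. It is an illustration only, not Pulsar or OpenTelemetry code: the class name `CumulativeHistogram`, the metric name `latency_ms`, and the bucket bounds are all assumptions made up for the example.

```python
from bisect import bisect_left


class CumulativeHistogram:
    """Illustrative histogram whose bucket counters only grow (never reset)."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One counter per finite bucket, plus a final "+Inf" bucket.
        self.counts = [0] * (len(self.upper_bounds) + 1)
        self.total = 0.0

    def observe(self, value):
        # bisect_left returns the first bound >= value, i.e. the bucket
        # for which "value <= le" holds.
        self.counts[bisect_left(self.upper_bounds, value)] += 1
        self.total += value

    def render(self, name):
        """Render as Prometheus text: cumulative buckets, bound in the "le" label."""
        lines, running = [], 0
        bounds = self.upper_bounds + [float("inf")]
        for bound, count in zip(bounds, self.counts):
            running += count  # each bucket includes all smaller buckets
            le = "+Inf" if bound == float("inf") else f"{bound:g}"
            lines.append(f'{name}_bucket{{le="{le}"}} {running}')
        lines.append(f"{name}_sum {self.total:g}")
        lines.append(f"{name}_count {running}")
        return "\n".join(lines)


# Hypothetical usage: five observations into buckets <=1, <=5, <=10, +Inf.
hist = CumulativeHistogram([1, 5, 10])
for sample in (0.5, 3, 3, 7, 20):
    hist.observe(sample)
print(hist.render("latency_ms"))
```

Because the counters never reset and every bucket shares one metric name, a query layer can take rates over the buckets and estimate quantiles such as p95 from them, which is what the per-minute reset and name-encoded bounds described above prevent.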
>>>>>>>> - Too many metrics are rates which also delta reset every interval you configure in Pulsar and restart, instead of relying on cumulative (ever growing) counters and let Prometheus use its rate function.
>>>>>>>> - and many more issues
>>>>>>>>
>>>>>>>> From the developer perspective:
>>>>>>>>
>>>>>>>> - There are 4 different ways to define and record metrics in Pulsar: Pulsar own metrics library, Prometheus Java Client, Bookkeeper metrics library and plain native Java SDK objects (AtomicLong, ...). It's very confusing for the developer and creates inconsistencies for the end user (e.g. Summary is different in each).
>>>>>>>> - Patching your metrics into "/metrics" Prometheus endpoint is confusing, cumbersome and error prone.
>>>>>>>> - many more
>>>>>>>>
>>>>>>>> This proposal offers several key changes to solve that:
>>>>>>>>
>>>>>>>> - Cardinality (supporting 10k-100k topics per broker) is solved by introducing a new aggregation level for metrics called Topic Metric Group. Using configuration, you specify for each topic its group (using wildcard/regex). This allows you to "zoom" out to a more detailed granularity level like groups instead of namespaces, which you control how many groups you'll have hence solving the cardinality issue, without sacrificing level of detail too much.
>>>>>>>> - Fine-grained filtering mechanism, dynamic. You'll have rule-based dynamic configuration, allowing you to specify per namespace/topic/group which metrics you'd like to keep/drop.
>>>>>>>>   Rules allow you to set the default to have small amount of metrics in group and namespace level only and drop the rest. When needed, you can add an override rule to "open" up a certain group to have more metrics in higher granularity (topic or even consumer/producer level). Since it's dynamic you "open" such a group when you see it's misbehaving, see it in topic level, and when all resolved, you can "close" it. A bit similar experience to logging levels in Log4j or Logback, that you default and override per class/package.
>>>>>>>>
>>>>>>>> Aggregation and Filtering combined solves the cardinality without sacrificing the level of detail when needed and most importantly, you determine which topic/group/namespace it happens on.
>>>>>>>>
>>>>>>>> Since this change is so invasive, it requires a single metrics library to implement all of it on top of; Hence the third big change point is consolidating all four ways to define and record metrics to a single one, a new one: OpenTelemetry Metrics (Java SDK, and also Python and Go for the Pulsar Function runners).
>>>>>>>> Introducing OpenTelemetry (OTel) solves also the biggest pain point from the developer perspective, since it's a superb metrics library offering everything you need, and there is going to be a single way - only it. Also, it solves the robustness for Plugin author which will use OpenTelemetry. It so happens that it also solves all the numerous problems described in the doc itself.
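
The grouping and filtering idea in the quoted TL;DR (map topics to groups by wildcard, default to group-level metrics, then "open" one group to topic level with an override rule) can be sketched roughly as follows. This is only an illustration of the concept: the thread does not show the PIP's actual configuration format, so the pattern lists, the `"default"` fallback group, the "last matching rule wins" semantics, and every function name here are assumptions.

```python
from fnmatch import fnmatchcase

# Hypothetical topic -> group mapping; the first matching wildcard wins.
GROUP_PATTERNS = [
    ("persistent://tenant/orders/*", "orders"),
    ("persistent://tenant/billing/*", "billing"),
]

# Hypothetical rules of (group pattern, granularity); the last match wins,
# so a later rule acts as an override of the default, like a logger level.
DEFAULT_RULES = [("*", "group")]
OVERRIDE_RULES = DEFAULT_RULES + [("orders", "topic")]


def group_of(topic):
    """Return the group of a topic, or "default" if no pattern matches."""
    for pattern, group in GROUP_PATTERNS:
        if fnmatchcase(topic, pattern):
            return group
    return "default"


def granularity_for(group, rules):
    """Return the granularity chosen by the last rule matching the group."""
    chosen = "group"
    for pattern, granularity in rules:
        if fnmatchcase(group, pattern):
            chosen = granularity
    return chosen


def aggregate(samples, rules):
    """Sum per-topic counter values into the series that would be exported."""
    exported = {}
    for topic, value in samples:
        group = group_of(topic)
        if granularity_for(group, rules) == "topic":
            key = ("topic", topic)   # group is "open": keep per-topic detail
        else:
            key = ("group", group)   # group is "closed": one series per group
        exported[key] = exported.get(key, 0) + value
    return exported


samples = [
    ("persistent://tenant/orders/a", 5),
    ("persistent://tenant/orders/b", 7),
    ("persistent://tenant/billing/x", 2),
    ("persistent://tenant/billing/y", 3),
    ("persistent://tenant/misc/z", 1),
]
print(aggregate(samples, DEFAULT_RULES))   # three group-level series
print(aggregate(samples, OVERRIDE_RULES))  # "orders" expanded to topic level
```

With the default rules the number of exported series depends on the number of groups rather than the number of topics; adding the override trades some of that reduction for detail on one group only, which matches the logging-level analogy in the quoted text.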
>>>>>>>>
>>>>>>>> The solution will be introduced as another layer with feature toggles, so you can work with existing system, and/or OTel, until gradually deprecating existing system.
>>>>>>>>
>>>>>>>> It's a big breaking change for Pulsar users on many fronts: names, semantics, configuration. Read at the end of this doc to learn exactly what will change for the user (in high level).
>>>>>>>>
>>>>>>>> In my opinion, it will make Pulsar user experience so much better, they will want to migrate to it, despite the breaking change.
>>>>>>>>
>>>>>>>> This was a very short summary. You are most welcome to read the full design document below and express feedback, so we can make it better.
>>>>>>>>
>>>>>>>> On Sun, May 7, 2023 at 7:52 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On Sun, May 7, 2023 at 4:23 PM Yunze Xu <y...@streamnative.io.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> I'm excited to learn much more about metrics when I started reading this proposal. But I became more and more frustrated when I found there is still too much content left even if I've already spent much time reading this proposal. I'm wondering how much time did you expect reviewers to read through this proposal? I just recalled the discussion you started before [1]. Did you expect each PMC member that gives his/her +1 to read only parts of this proposal?
>>>>>>>>>
>>>>>>>>> I estimated around 2 hours needed for a reviewer.
>>>>>>>>> I hate it being so long, but I simply couldn't find a way to downsize it more. Furthermore, I consulted with my colleagues including Matteo, but we couldn't see a way to scope it down.
>>>>>>>>> Why? Because once you begin this journey, you need to know how it's going to end.
>>>>>>>>> What I ended up doing, is writing all the crucial details for review in the High Level Design section.
>>>>>>>>> It's still a big, hefty section, but I don't think I can step out or let anyone else change Pulsar so invasively without the full extent of the change.
>>>>>>>>>
>>>>>>>>> I don't think it's wise to read parts.
>>>>>>>>> I did my very best effort to minimize it, but the scope is simply big.
>>>>>>>>> Open for suggestions, but it requires reading all the PIP :)
>>>>>>>>>
>>>>>>>>> Thanks a lot Yunze for dedicating any time to it.
>>>>>>>>>
>>>>>>>>>> Let's talk back to the proposal, for now, what I mainly learned and are concerned about mostly are:
>>>>>>>>>> 1. Pulsar has many ways to expose metrics. It's not unified and confusing.
>>>>>>>>>> 2. The current metrics system cannot support a large amount of topics.
>>>>>>>>>> 3. It's hard for plugin authors to integrate metrics. (For example, KoP [2] integrates metrics by implementing the PrometheusRawMetricsProvider interface and it indeed needs much work)
>>>>>>>>>>
>>>>>>>>>> Regarding the 1st issue, this proposal chooses OpenTelemetry (OTel).
>>>>>>>>>>
>>>>>>>>>> Regarding the 2nd issue, I scrolled to the "Why OpenTelemetry?" section.
>>>>>>>>>> It's still frustrating to see no answer. Eventually, I found
>>>>>>>>>
>>>>>>>>> OpenTelemetry isn't the solution for large amount of topic.
>>>>>>>>> The solution is described at "Aggregate and Filtering to solve cardinality issues" section.
>>>>>>>>>
>>>>>>>>>> the explanation in the "What we need to fix in OpenTelemetry - Performance" section. It seems that we still need some enhancements in OTel. In other words, currently OTel is not ready for resolving all these issues listed in the proposal but we believe it will.
>>>>>>>>>
>>>>>>>>> Let me rephrase "believe" --> we work together with the maintainers to do it, yes.
>>>>>>>>> I am open for any other suggestion.
>>>>>>>>>
>>>>>>>>>> As for the 3rd issue, from the "Integrating with Pulsar Plugins" section, the plugin authors still need to implement the new OTel interfaces. Is it much easier than using the existing ways to expose metrics? Could metrics still be easily integrated with Grafana?
>>>>>>>>>
>>>>>>>>> Yes, it's way easier.
>>>>>>>>> Basically you have a full fledged metrics library objects: Meter, Gauge, Histogram, Counter.
>>>>>>>>> No more Raw Metrics Provider, writing UTF-8 bytes in Prometheus format.
>>>>>>>>> You get namespacing for free with Meter name and version.
>>>>>>>>> It's way better than current solution and any other library.
>>>>>>>>>
>>>>>>>>>> That's all I am concerned about at the moment.
>>>>>>>>>> I understand, and appreciate that you've spent much time studying and explaining all these things. But, this proposal is still too huge.
>>>>>>>>>
>>>>>>>>> I appreciate your effort a lot!
>>>>>>>>>
>>>>>>>>>> [1] https://lists.apache.org/thread/04jxqskcwwzdyfghkv4zstxxmzn154kf
>>>>>>>>>> [2] https://github.com/streamnative/kop/blob/master/kafka-impl/src/main/java/io/streamnative/pulsar/handlers/kop/stats/PrometheusMetricsProvider.java
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Yunze
>>>>>>>>>>
>>>>>>>>>> On Sun, May 7, 2023 at 5:53 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I'm very appreciative for feedback from multiple pulsar users and devs on this PIP, since it has dramatic changes suggested and quite extensive positive change for the users.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 27, 2023 at 7:32 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm very excited to release a PIP I've been working on in the past 11 months, which I think will be immensely valuable to Pulsar, which I like so much.
>>>>>>>>>>>>
>>>>>>>>>>>> PIP: https://github.com/apache/pulsar/issues/20197
>>>>>>>>>>>>
>>>>>>>>>>>> I'm quoting here the preface:
>>>>>>>>>>>>
>>>>>>>>>>>> === QUOTE START ===
>>>>>>>>>>>>
>>>>>>>>>>>> Roughly 11 months ago, I started working on solving the biggest issue with Pulsar metrics: the lack of ability to monitor a pulsar broker with a large topic count: 10k, 100k, and future support of 1M. This started by mapping the existing functionality and then enumerating all the problems I saw (all documented in this doc <https://docs.google.com/document/d/1vke4w1nt7EEgOvEerPEUS-Al3aqLTm9cl2wTBkKNXUA/edit?usp=sharing>).
>>>>>
>>>>> I thought we were going to stop using Google docs for PIPs.
>>>>>
>>>>>>>>>>>> This PIP is a parent PIP. It aims to gradually solve (using sub-PIPs) all the current metric system's problems and provide the ability to monitor a broker with a large topic count, which is currently lacking. As a parent PIP, it will describe each problem and its solution at a high level, leaving fine-grained details to the sub-PIPs. The parent PIP ensures all solutions align and do not contradict each other.
>>>>>>>>>>>>
>>>>>>>>>>>> The basic building block to solve the monitoring ability of large topic count is aggregating internally (to topic groups) and adding fine-grained filtering.
>>>>>>>>>>>> We could have shoe-horned it into the existing metric system, but we thought adding that to a system already ingrained with many problems would be wrong and hard to do gradually, as so many things will break. This is why the second-biggest design decision presented here is consolidating all existing metric libraries into a single one - OpenTelemetry <https://opentelemetry.io/>. The parent PIP will explain why OpenTelemetry was chosen out of existing solutions and why it far exceeds all other options. I’ve been working closely with the OpenTelemetry community in the past eight months: brain-storming this integration, and raising issues, in an effort to remove serious blockers to make this migration successful.
>>>>>>>>>>>>
>>>>>>>>>>>> I made every effort to summarize this document so that it can be concise yet clear. I understand it is an effort to read it and, more so, provide meaningful feedback on such a large document; hence I’m very grateful for each individual who does so.
>>>>>>>>>>>>
>>>>>>>>>>>> I think this design will help improve the user experience immensely, so it is worth the time spent reading it.
>>>>>>>>>>>>
>>>>>>>>>>>> === QUOTE END ===
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Asaf Mesika