Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

Asaf Mesika Mon, 28 Aug 2023 07:48:48 -0700

I've relocated the PIP content from the issue (
https://github.com/apache/pulsar/issues/20197) to a PR (
https://github.com/apache/pulsar/pull/21080) so I could add TOC and also be
inlined with the new process.




On Mon, Aug 28, 2023 at 5:46 PM Asaf Mesika <asaf.mes...@gmail.com> wrote:

> Thanks for taking the time to review the document - *highly appreciated*.
> I'm inlined my comments below.
>
>
> On Mon, Aug 21, 2023 at 12:19 PM Hang Chen <chenh...@apache.org> wrote:
>
>> Hi Asaf,
>>     Thanks for bring up the great proposal.
>>
>> After reading this proposal, I have the following questions.
>> 1. This proposal will introduce a big break change in Pulsar,
>> especially in code perspective. I’m interested in how to support both
>> old and new implementation at the same time step by step?
>>
>> >We will keep the current metric system as is, and add a new layer of
>> metrics using OpenTelemetry Java SDK. All of Pulsar’s metrics will be
>> create also using OpenTelemetry. A feature flag will allow enabling
>> OpenTelemetry metrics (init, recording and exporting). All the features and
>> changes described here will be done only in the OpenTelemetry layer,
>> allowing to keep the old version working until you’re ready to switch using
>> the OTel (OpenTelemetry) implementation. In the far future, once OTel usage
>> has stabilized and became widely adopted we’ll deprecate current metric
>> system and eventually remove it. We will also make sure there is feature
>> flag to turn off current Prometheus based metric system.
>>
>>
> Current metrics code remains as is, untouched.
> I'm adding new code, using OpenTelemetry API and SDK. The code in most
> cases will read the existing variables (like msgsReceived), and in other
> cases will setup its own new objects like Counter, Histogram and *also*
> record values to them.
> You can take a look at the revised PIP as I've added tiny code sample to
> be use as an idea how it will look like. Look here
> <https://github.com/apache/pulsar/blob/6ec0bde4127a54ab8e8bb67fb091c932fa2952a4/pip/pip-264.md#consolidating-to-opentelemetry>
> .
>
>
>
>> 2. We introduced Group and filter logic in the metric system, and I
>> have the following concerns.
>> - We need to add protection logic or pre-validation for the group and
>> filter rules to avoid users mis-configured causes huge performance
>> impaction on Pulsar brokers
>>
>>
> Good call. I've added a note in the PIP, that we will reject any filter
> rules update if the expected number of data points exceed certain
> threshold. I left this as detail to be specified in the sub-PIP.
>
> - We need to support expose all the topic-level metrics when the
>> Pulsar cluster just has thounds of topics
>>
>> I've added a new goal: "- New system should support the maximum
> supported number of topics in current system (i.e. 4k topics) without
> filtering"
>
>
>> - Even though we introduced group and filter for the metrics, we still
>> can’t resolve large number of metrics exposed to Prometheus. Exposing
>> large a mount of data (100MB+) throughput HTTP endpoint in
>> ineffective. We can consider expose those metric data by Pulsar topic
>> and develop a Pulsar to Prometheus connector to write Pulsar metric
>> data to Prometheus in streaming mode instead of batch mode to reduce
>> the performace impaction
>>
>
> As I wrote in my PIP. If you find your self exporting 100MB or more of
> metric data *every* 30 seconds you will suffer from:
> * High cost of TSDB holding that (e.g. Prometheus, Cortex, VictoriaMetrics)
> * Query time out since there is too much data to read
>
> Also, the bottleneck is not transfer time over the wire. It's mostly the
> memory needed by any TSDB to hold it for at least 2 hours before flushing
> it to disk - this it the most expensive of all.
>
> At 100mb response size, filtering and grouping are a must.
>
>
>
>>
>> - Group and filter logic uses regular expressions extensively in
>> rules. Regular expression parsing and matching are CPU and time
>> intensive operations. We have push-down filter to reduce the generated
>> metrics number, but still can’t solve the regular expression matching
>> issues. If the user provide a complex regular expression for group and
>> filter rule, the metric generating thread will be the bottleneck and
>> will block other threads if we use synchronous call.
>>
>>
> I plan to use caching as wrote in the PIP. Roughly (instrument,
> attributes) -> boolean. It's basically as if we are adding one boolean to
> PersistentTopic class - it has so many properties and size added is
> negligible.
>
>
>> - Group and filter rule is a litter complex for users and we need to
>> provide a UI or tool to help user write the correct and effective
>> rules and show the new rules impaction on old rules.
>>
>
> We don't have UI for now in Pulsar. We will make sure pulsar CLI will be
> convenient enough.
>
>
>>
>>
>> Thanks,
>> Hang
>>
>> Matteo Merli <matteo.me...@gmail.com> 于2023年6月15日周四 23:14写道：
>> >
>> > > Proposing a large breaking change (even if it's crucial) is the single
>> > fastest way to motivate your users to migrate to a different platform. I
>> > wish it wasn't the case, but it's the cold reality.
>> >
>> > If you read the proposal, there is no real breaking change. There will
>> be a
>> > switch to choose the existing metrics or the new ones. The dashboards
>> will
>> > be updated and provided.
>> >
>> > At the same time, the best sure way to motivate users to switch or not
>> > adopt a platform is to stick with confusing/inconsistent APIs/Metrics.
>> >
>> >
>> > --
>> > Matteo Merli
>> > <matteo.me...@gmail.com>
>> >
>> >
>> > On Wed, Jun 14, 2023 at 6:10 PM Devin Bost <devin.b...@gmail.com>
>> wrote:
>> >
>> > > > Thanks for the details, Devin. Curios - 'We' stands for which
>> company?
>> > >
>> > > What do you mean? I was quoting Rajan when I said, "we."
>> > >
>> > >
>> > > Devin G. Bost
>> > >
>> > >
>> > > On Wed, Jun 14, 2023 at 10:02 AM Asaf Mesika <asaf.mes...@gmail.com>
>> > > wrote:
>> > >
>> > > > Thanks for the details, Devin. Curios - 'We' stands for which
>> company?
>> > > >
>> > > > Can you take a look at my previous response to see if it answers the
>> > > > concern you raised?
>> > > >
>> > > > Thanks!
>> > > >
>> > > >
>> > > > On Wed, Jun 14, 2023 at 1:49 PM Devin Bost <devin.b...@gmail.com>
>> wrote:
>> > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > Are we proposing a change to break existing metrics
>> compatibility
>> > > > > > (prometheus)? If that is the case then it's a big red flag as
>> it will
>> > > > be
>> > > > > a
>> > > > > > pain for any company to upgrade Pulsar as monitoring is THE most
>> > > > > important
>> > > > > > part of the system and we don't even want to break
>> compatibility for
>> > > > any
>> > > > > > small things to avoid interruption for users that are using
>> Pulsar
>> > > > > system.
>> > > > > > I think it's always good to enhance a system by maintaining
>> > > > compatibility
>> > > > > > and I would be fine if we can introduce new metrics API without
>> > > causing
>> > > > > ANY
>> > > > > > interruption to existing metrics API. But if we can't maintain
>> > > > > > compatibility then it's a big red flag and not acceptable for
>> the
>> > > > Pulsar
>> > > > > > community.
>> > > > >
>> > > > > Proposing a large breaking change (even if it's crucial) is the
>> single
>> > > > > fastest way to motivate your users to migrate to a different
>> platform.
>> > > I
>> > > > > wish it wasn't the case, but it's the cold reality.
>> > > > >
>> > > > > With that said, I'm a big proponent of Open Telemetry. I did a big
>> > > video
>> > > > a
>> > > > > while back that some of you may remember on the use of Open
>> Tracing
>> > > > (before
>> > > > > it was merged into Open Telemetry). Open Telemetry has gained
>> > > > considerable
>> > > > > momentum in the industry since then.
>> > > > >
>> > > > > I'm also very interested in a solution to the metrics problem.
>> I've run
>> > > > > into the scalability issues with metrics in production, and I've
>> been
>> > > > very
>> > > > > concerned about the metrics bottlenecks around our ability to
>> deliver
>> > > our
>> > > > > promises around supporting large numbers of topics. One of the big
>> > > > > advantages of Pulsar over Kafka is supposed to be that topics are
>> > > cheap,
>> > > > > but as it stands, our current metrics design gets seriously in
>> the way
>> > > of
>> > > > > that. Generally speaking, I'm open to solutions, especially if
>> they
>> > > align
>> > > > > us with a growing industry standard.
>> > > > >
>> > > > > - Devin
>> > > > >
>> > > > >
>> > > > > On Wed, Jun 14, 2023, 3:28 AM Enrico Olivelli <
>> eolive...@gmail.com>
>> > > > wrote:
>> > > > >
>> > > > > > Il Mer 14 Giu 2023, 04:33 Rajan Dhabalia <rdhaba...@apache.org>
>> ha
>> > > > > > scritto:
>> > > > > >
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > > Are we proposing a change to break existing metrics
>> compatibility
>> > > > > > > (prometheus)? If that is the case then it's a big red flag as
>> it
>> > > will
>> > > > > be
>> > > > > > a
>> > > > > > > pain for any company to upgrade Pulsar as monitoring is THE
>> most
>> > > > > > important
>> > > > > > > part of the system and we don't even want to break
>> compatibility
>> > > for
>> > > > > any
>> > > > > > > small things to avoid interruption for users that are using
>> Pulsar
>> > > > > > system.
>> > > > > > > I think it's always good to enhance a system by maintaining
>> > > > > compatibility
>> > > > > > > and I would be fine if we can introduce new metrics API
>> without
>> > > > causing
>> > > > > > ANY
>> > > > > > > interruption to existing metrics API. But if we can't maintain
>> > > > > > > compatibility then it's a big red flag and not acceptable for
>> the
>> > > > > Pulsar
>> > > > > > > community.
>> > > > > > >
>> > > > > >
>> > > > > > I agree.
>> > > > > >
>> > > > > > If it is possible to export data Ina way that is compatible with
>> > > > > Prometheus
>> > > > > > without adding too much overhead then I would support this work.
>> > > > > >
>> > > > > > About renaming the metrics: we can do it only if tue changes for
>> > > users
>> > > > > are
>> > > > > > as trivial as replacing the queries in the grafana dashboard or
>> in
>> > > > > alerting
>> > > > > > systems.
>> > > > > >
>> > > > > > Asaf, do you have prototype? Built over any version of Pulsar?
>> > > > > >
>> > > > > > Also, it would be very useful to start an initiative to collect
>> the
>> > > > list
>> > > > > of
>> > > > > > metrics that people really use in production, especially for
>> > > automated
>> > > > > > alerts.
>> > > > > >
>> > > > > > In my experience you usually care about:
>> > > > > > - in/out traffic (rates, bytes...)
>> > > > > > - number of producer, consumers, topics, subscriptions...
>> > > > > > - backlog
>> > > > > > - jvm metrics
>> > > > > > - function custom metrics
>> > > > > >
>> > > > > >
>> > > > > > Enrico
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > > Thanks,
>> > > > > > > Rajan
>> > > > > > >
>> > > > > > > On Sun, May 21, 2023 at 9:01 AM Asaf Mesika <
>> asaf.mes...@gmail.com
>> > > >
>> > > > > > wrote:
>> > > > > > >
>> > > > > > > > Thanks for the reply, Enrico.
>> > > > > > > > Completely agree.
>> > > > > > > > This made me realize my TL;DR wasn't talking about export.
>> > > > > > > > I added this to it:
>> > > > > > > >
>> > > > > > > > ---
>> > > > > > > > Pulsar OTel Metrics will support exporting as Prometheus
>> HTTP
>> > > > > endpoint
>> > > > > > > > (`/metrics` but different port) for backward compatibility
>> and
>> > > also
>> > > > > > OLTP,
>> > > > > > > > so you can push the metrics to OTel Collector and from
>> there ship
>> > > > it
>> > > > > to
>> > > > > > > any
>> > > > > > > > destination.
>> > > > > > > > ---
>> > > > > > > >
>> > > > > > > > OTel supports two kinds of exporter: Prometheus (HTTP) and
>> OTLP
>> > > > > (push).
>> > > > > > > > We'll just configure to use them.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Mon, May 15, 2023 at 10:35 AM Enrico Olivelli <
>> > > > > eolive...@gmail.com>
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Asaf,
>> > > > > > > > > thanks for contributing in this area.
>> > > > > > > > > Metrics are a fundamental feature of Pulsar.
>> > > > > > > > >
>> > > > > > > > > Currently I find it very awkward to maintain metrics, and
>> also
>> > > I
>> > > > > see
>> > > > > > > > > it as a problem to support only Prometheus.
>> > > > > > > > >
>> > > > > > > > > Regarding your proposal, IIRC in the past someone else
>> proposed
>> > > > to
>> > > > > > > > > support other metrics systems and they have been
>> suggested to
>> > > > use a
>> > > > > > > > > sidecar approach,
>> > > > > > > > > that is to add something next to Pulsar services that
>> served
>> > > the
>> > > > > > > > > metrics in the preferred format/way.
>> > > > > > > > > I find that the sidecar approach is too inefficient and I
>> am
>> > > not
>> > > > > > > > > proposing it (but I wanted to add this reference for the
>> > > benefit
>> > > > of
>> > > > > > > > > new people on the list).
>> > > > > > > > >
>> > > > > > > > > I wonder if it would be possible to keep compatibility
>> with the
>> > > > > > > > > current Prometheus based metrics.
>> > > > > > > > > Now Pulsar reached a point in which is is widely used by
>> many
>> > > > > > > > > companies and also with big clusters,
>> > > > > > > > > telling people that they have to rework all the
>> infrastructure
>> > > > > > related
>> > > > > > > > > to metrics because we don't support Prometheus anymore or
>> > > because
>> > > > > we
>> > > > > > > > > changed radically the way we publish metrics
>> > > > > > > > > It is a step that seems too hard from my point of view.
>> > > > > > > > >
>> > > > > > > > > Currently I believe that compatibility is more important
>> than
>> > > > > > > > > versatility, and if we want to introduce new (and far
>> better)
>> > > > > > features
>> > > > > > > > > we must take it into account.
>> > > > > > > > >
>> > > > > > > > > So my point is that I generally support the idea of
>> opening the
>> > > > way
>> > > > > > to
>> > > > > > > > > Open Telemetry, but we must have a way to not force all
>> of our
>> > > > > users
>> > > > > > > > > to throw away their alerting systems, dashboards and
>> know-how
>> > > in
>> > > > > > > > > troubleshooting Pulsar problems in production and dev
>> > > > > > > > >
>> > > > > > > > > Best regards
>> > > > > > > > > Enrico
>> > > > > > > > >
>> > > > > > > > > Il giorno lun 15 mag 2023 alle ore 02:17 Dave Fisher
>> > > > > > > > > <wave4d...@comcast.net> ha scritto:
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > > On May 10, 2023, at 1:01 AM, Asaf Mesika <
>> > > > > asaf.mes...@gmail.com>
>> > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > On Tue, May 9, 2023 at 11:29 PM Dave Fisher <
>> > > > w...@apache.org>
>> > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > >>
>> > > > > > > > > > >>
>> > > > > > > > > > >>>> On May 8, 2023, at 2:49 AM, Asaf Mesika <
>> > > > > > asaf.mes...@gmail.com>
>> > > > > > > > > wrote:
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> Your feedback made me realized I need to add "TL;DR"
>> > > > section,
>> > > > > > > > which I
>> > > > > > > > > > >> just
>> > > > > > > > > > >>> added.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> I'm quoting it here. It gives a brief summary of the
>> > > > > proposal,
>> > > > > > > > which
>> > > > > > > > > > >>> requires up to 5 min of read time, helping you get
>> a high
>> > > > > level
>> > > > > > > > > picture
>> > > > > > > > > > >>> before you dive into the
>> background/motivation/solution.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> ----------------------
>> > > > > > > > > > >>> TL;DR
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> Working with Metrics today as a user or a developer
>> is
>> > > hard
>> > > > > and
>> > > > > > > has
>> > > > > > > > > many
>> > > > > > > > > > >>> severe issues.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> From the user perspective:
>> > > > > > > > > > >>>
>> > > > > > > > > > >>>  - One of Pulsar strongest feature is "cheap"
>> topics so
>> > > you
>> > > > > can
>> > > > > > > > > easily
>> > > > > > > > > > >>>  have 10k - 100k topics per broker. Once you do
>> that, you
>> > > > > > quickly
>> > > > > > > > > learn
>> > > > > > > > > > >> that
>> > > > > > > > > > >>>  the amount of metrics you export via "/metrics"
>> > > > (Prometheus
>> > > > > > > style
>> > > > > > > > > > >> endpoint)
>> > > > > > > > > > >>>  becomes really big. The cost to store them becomes
>> too
>> > > > high,
>> > > > > > > > queries
>> > > > > > > > > > >>>  time-out or even "/metrics" endpoint it self times
>> out.
>> > > > > > > > > > >>>  The only option Pulsar gives you today is
>> all-or-nothing
>> > > > > > > filtering
>> > > > > > > > > and
>> > > > > > > > > > >>>  very crude aggregation. You switch metrics from
>> topic
>> > > > > > > aggregation
>> > > > > > > > > > >> level to
>> > > > > > > > > > >>>  namespace aggregation level. Also you can turn off
>> > > > producer
>> > > > > > and
>> > > > > > > > > > >> consumer
>> > > > > > > > > > >>>  level metrics. You end up doing it all leaving you
>> > > > "blind",
>> > > > > > > > looking
>> > > > > > > > > at
>> > > > > > > > > > >> the
>> > > > > > > > > > >>>  metrics from a namespace level which is too high
>> level.
>> > > > You
>> > > > > > end
>> > > > > > > up
>> > > > > > > > > > >>>  conjuring all kinds of scripts on top of topic
>> stats
>> > > > > endpoint
>> > > > > > to
>> > > > > > > > > glue
>> > > > > > > > > > >> some
>> > > > > > > > > > >>>  aggregated metrics view for the topics you need.
>> > > > > > > > > > >>>  - Summaries (metric type giving you quantiles like
>> p95)
>> > > > > which
>> > > > > > > are
>> > > > > > > > > used
>> > > > > > > > > > >>>  in Pulsar, can't be aggregated across topics /
>> brokers
>> > > due
>> > > > > its
>> > > > > > > > > inherent
>> > > > > > > > > > >>>  design.
>> > > > > > > > > > >>>  - Plugin authors spend too much time on defining
>> and
>> > > > > exposing
>> > > > > > > > > metrics
>> > > > > > > > > > >> to
>> > > > > > > > > > >>>  Pulsar since the only interface Pulsar offers is
>> writing
>> > > > > your
>> > > > > > > > > metrics
>> > > > > > > > > > >> by
>> > > > > > > > > > >>>  your self as UTF-8 bytes in Prometheus Text Format
>> to
>> > > byte
>> > > > > > > stream
>> > > > > > > > > > >> interface
>> > > > > > > > > > >>>  given to you.
>> > > > > > > > > > >>>  - Pulsar histograms are exported in a way that is
>> not
>> > > > > > conformant
>> > > > > > > > > with
>> > > > > > > > > > >>>  Prometheus, which means you can't get the p95
>> quantile
>> > > on
>> > > > > such
>> > > > > > > > > > >> histograms,
>> > > > > > > > > > >>>  making them very hard to use in day to day life.
>> > > > > > > > > > >>
>> > > > > > > > > > >> What version of DataSketches is used to produce the
>> > > > histogram?
>> > > > > > Is
>> > > > > > > is
>> > > > > > > > > still
>> > > > > > > > > > >> an old Yahoo one, or are we using an updated one from
>> > > Apache
>> > > > > > > > > DataSketches?
>> > > > > > > > > > >>
>> > > > > > > > > > >> Seems like this is a single PR/small PIP for 3.1?
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > Histograms are a list of buckets, each is a counter.
>> > > > > > > > > > > Summary is a collection of values collected over a
>> time
>> > > > window,
>> > > > > > > which
>> > > > > > > > > at
>> > > > > > > > > > > the end you get a calculation of the quantiles of
>> those
>> > > > values:
>> > > > > > > p95,
>> > > > > > > > > p50,
>> > > > > > > > > > > and those are exported from Pulsar.
>> > > > > > > > > > >
>> > > > > > > > > > > Pulsar histogram do not use Data Sketches.
>> > > > > > > > > >
>> > > > > > > > > > Bookkeeper Metrics wraps Yahoo DataSketches last I
>> checked.
>> > > > > > > > > >
>> > > > > > > > > > > They are just counters.
>> > > > > > > > > > > They are not adhere to Prometheus since:
>> > > > > > > > > > > a. The counter is expected to be cumulative, but
>> Pulsar
>> > > > resets
>> > > > > > each
>> > > > > > > > > bucket
>> > > > > > > > > > > counter to 0 every 1 min
>> > > > > > > > > > > b. The bucket upper range is expected to be written
>> as an
>> > > > > > attribute
>> > > > > > > > > "le"
>> > > > > > > > > > > but today it is encoded in the name of the metric
>> itself.
>> > > > > > > > > > >
>> > > > > > > > > > > This is a breaking change, hence hard to mark in any
>> small
>> > > > > > release.
>> > > > > > > > > > > This is why it's part of this PIP since so many
>> things will
>> > > > > > break,
>> > > > > > > > and
>> > > > > > > > > all
>> > > > > > > > > > > of them will break on a separate layer (OTel metrics),
>> > > hence
>> > > > > not
>> > > > > > > > break
>> > > > > > > > > > > anyone without their consent.
>> > > > > > > > > >
>> > > > > > > > > > If this change will break existing Grafana dashboards
>> and
>> > > other
>> > > > > > > > > operational monitoring already in place then it will break
>> > > > > guarantees
>> > > > > > > we
>> > > > > > > > > have made about safely being able to downgrade from a bad
>> > > > upgrade.
>> > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >>
>> > > > > > > > > > >>
>> > > > > > > > > > >>>  - Too many metrics are rates which also delta reset
>> > > every
>> > > > > > > interval
>> > > > > > > > > you
>> > > > > > > > > > >>>  configure in Pulsar and restart, instead of
>> relying on
>> > > > > > > cumulative
>> > > > > > > > > (ever
>> > > > > > > > > > >>>  growing) counters and let Prometheus use its rate
>> > > > function.
>> > > > > > > > > > >>>  - and many more issues
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> From the developer perspective:
>> > > > > > > > > > >>>
>> > > > > > > > > > >>>  - There are 4 different ways to define and record
>> > > metrics
>> > > > in
>> > > > > > > > Pulsar:
>> > > > > > > > > > >>>  Pulsar own metrics library, Prometheus Java Client,
>> > > > > Bookkeeper
>> > > > > > > > > metrics
>> > > > > > > > > > >>>  library and plain native Java SDK objects
>> (AtomicLong,
>> > > > ...).
>> > > > > > > It's
>> > > > > > > > > very
>> > > > > > > > > > >>>  confusing for the developer and create
>> inconsistencies
>> > > for
>> > > > > the
>> > > > > > > end
>> > > > > > > > > user
>> > > > > > > > > > >>>  (e.g. Summary for example is different in each).
>> > > > > > > > > > >>>  - Patching your metrics into "/metrics" Prometheus
>> > > > endpoint
>> > > > > is
>> > > > > > > > > > >>>  confusing, cumbersome and error prone.
>> > > > > > > > > > >>>  - many more
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> This proposal offers several key changes to solve
>> that:
>> > > > > > > > > > >>>
>> > > > > > > > > > >>>  - Cardinality (supporting 10k-100k topics per
>> broker) is
>> > > > > > solved
>> > > > > > > by
>> > > > > > > > > > >>>  introducing a new aggregation level for metrics
>> called
>> > > > Topic
>> > > > > > > > Metric
>> > > > > > > > > > >> Group.
>> > > > > > > > > > >>>  Using configuration, you specify for each topic its
>> > > group
>> > > > > > (using
>> > > > > > > > > > >>>  wildcard/regex). This allows you to "zoom" out to
>> a more
>> > > > > > > detailed
>> > > > > > > > > > >>>  granularity level like groups instead of
>> namespaces,
>> > > which
>> > > > > you
>> > > > > > > > > control
>> > > > > > > > > > >> how
>> > > > > > > > > > >>>  many groups you'll have hence solving the
>> cardinality
>> > > > issue,
>> > > > > > > > without
>> > > > > > > > > > >>>  sacrificing level of detail too much.
>> > > > > > > > > > >>>  - Fine-grained filtering mechanism, dynamic.
>> You'll have
>> > > > > > > > rule-based
>> > > > > > > > > > >>>  dynamic configuration, allowing you to specify per
>> > > > > > > > > > >> namespace/topic/group
>> > > > > > > > > > >>>  which metrics you'd like to keep/drop. Rules
>> allows you
>> > > to
>> > > > > set
>> > > > > > > the
>> > > > > > > > > > >> default
>> > > > > > > > > > >>>  to have small amount of metrics in group and
>> namespace
>> > > > level
>> > > > > > > only
>> > > > > > > > > and
>> > > > > > > > > > >> drop
>> > > > > > > > > > >>>  the rest. When needed, you can add an override
>> rule to
>> > > > > "open"
>> > > > > > > up a
>> > > > > > > > > > >> certain
>> > > > > > > > > > >>>  group to have more metrics in higher granularity
>> (topic
>> > > or
>> > > > > > even
>> > > > > > > > > > >>>  consumer/producer level). Since it's dynamic you
>> "open"
>> > > > > such a
>> > > > > > > > group
>> > > > > > > > > > >> when
>> > > > > > > > > > >>>  you see it's misbehaving, see it in topic level,
>> and
>> > > when
>> > > > > all
>> > > > > > > > > > >> resolved, you
>> > > > > > > > > > >>>  can "close" it. A bit similar experience to logging
>> > > levels
>> > > > > in
>> > > > > > > > Log4j
>> > > > > > > > > or
>> > > > > > > > > > >>>  Logback, that you default and override per
>> > > class/package.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> Aggregation and Filtering combined solves the
>> cardinality
>> > > > > > without
>> > > > > > > > > > >>> sacrificing the level of detail when needed and most
>> > > > > > importantly,
>> > > > > > > > you
>> > > > > > > > > > >>> determine which topic/group/namespace it happens on.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> Since this change is so invasive, it requires a
>> single
>> > > > > metrics
>> > > > > > > > > library to
>> > > > > > > > > > >>> implement all of it on top of; Hence the third big
>> change
>> > > > > point
>> > > > > > > is
>> > > > > > > > > > >>> consolidating all four ways to define and record
>> metrics
>> > > > to a
>> > > > > > > > single
>> > > > > > > > > > >> one, a
>> > > > > > > > > > >>> new one: OpenTelemtry Metrics (Java SDK, and also
>> Python
>> > > > and
>> > > > > Go
>> > > > > > > for
>> > > > > > > > > the
>> > > > > > > > > > >>> Pulsar Function runners).
>> > > > > > > > > > >>> Introducing OpenTelemetry (OTel) solves also the
>> biggest
>> > > > pain
>> > > > > > > point
>> > > > > > > > > from
>> > > > > > > > > > >>> the developer perspective, since it's a superb
>> metrics
>> > > > > library
>> > > > > > > > > offering
>> > > > > > > > > > >>> everything you need, and there is going to be a
>> single
>> > > way
>> > > > -
>> > > > > > only
>> > > > > > > > it.
>> > > > > > > > > > >> Also,
>> > > > > > > > > > >>> it solves the robustness for Plugin author which
>> will use
>> > > > > > > > > OpenTelemetry.
>> > > > > > > > > > >> It
>> > > > > > > > > > >>> so happens that it also solves all the numerous
>> problems
>> > > > > > > described
>> > > > > > > > > in the
>> > > > > > > > > > >>> doc itself.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> The solution will be introduced as another layer
>> with
>> > > > feature
>> > > > > > > > > toggles, so
>> > > > > > > > > > >>> you can work with existing system, and/or OTel,
>> until
>> > > > > gradually
>> > > > > > > > > > >> deprecating
>> > > > > > > > > > >>> existing system.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> It's a big breaking change for Pulsar users on many
>> > > fronts:
>> > > > > > > names,
>> > > > > > > > > > >>> semantics, configuration. Read at the end of this
>> doc to
>> > > > > learn
>> > > > > > > > > exactly
>> > > > > > > > > > >> what
>> > > > > > > > > > >>> will change for the user (in high level).
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> In my opinion, it will make Pulsar user experience
>> so
>> > > much
>> > > > > > > better,
>> > > > > > > > > they
>> > > > > > > > > > >>> will want to migrate to it, despite the breaking
>> change.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> This was a very short summary. You are most
>> welcomed to
>> > > > read
>> > > > > > the
>> > > > > > > > full
>> > > > > > > > > > >>> design document below and express feedback, so we
>> can
>> > > make
>> > > > it
>> > > > > > > > better.
>> > > > > > > > > > >>>
>> > > > > > > > > > >>> On Sun, May 7, 2023 at 7:52 PM Asaf Mesika <
>> > > > > > > asaf.mes...@gmail.com>
>> > > > > > > > > > >> wrote:
>> > > > > > > > > > >>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> On Sun, May 7, 2023 at 4:23 PM Yunze Xu
>> > > > > > > > > <y...@streamnative.io.invalid>
>> > > > > > > > > > >>>> wrote:
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>> I'm excited to learn much more about metrics when
>> I
>> > > > started
>> > > > > > > > reading
>> > > > > > > > > > >>>>> this proposal. But I became more and more
>> frustrated
>> > > > when I
>> > > > > > > found
>> > > > > > > > > > >>>>> there is still too much content left even if I've
>> > > already
>> > > > > > spent
>> > > > > > > > > much
>> > > > > > > > > > >>>>> time reading this proposal. I'm wondering how
>> much time
>> > > > did
>> > > > > > you
>> > > > > > > > > expect
>> > > > > > > > > > >>>>> reviewers to read through this proposal? I just
>> > > recalled
>> > > > > the
>> > > > > > > > > > >>>>> discussion you started before [1]. Did you expect
>> each
>> > > > PMC
>> > > > > > > member
>> > > > > > > > > that
>> > > > > > > > > > >>>>> gives his/her +1 to read only parts of this
>> proposal?
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> I estimated around 2 hours needed for a reviewer.
>> > > > > > > > > > >>>> I hate it being so long, but I simply couldn't
>> find a
>> > > way
>> > > > to
>> > > > > > > > > downsize it
>> > > > > > > > > > >>>> more. Furthermore, I consulted with my colleagues
>> > > > including
>> > > > > > > > Matteo,
>> > > > > > > > > but
>> > > > > > > > > > >> we
>> > > > > > > > > > >>>> couldn't see a way to scope it down.
>> > > > > > > > > > >>>> Why? Because once you begin this journey, you need
>> to
>> > > know
>> > > > > how
>> > > > > > > > it's
>> > > > > > > > > > >> going
>> > > > > > > > > > >>>> to end.
>> > > > > > > > > > >>>> What I ended up doing, is writing all the crucial
>> > > details
>> > > > > for
>> > > > > > > > > review in
>> > > > > > > > > > >>>> the High Level Design section.
>> > > > > > > > > > >>>> It's still a big, hefty section, but I don't think
>> I can
>> > > > > step
>> > > > > > > out
>> > > > > > > > > or let
>> > > > > > > > > > >>>> anyone else change Pulsar so invasively without
>> the full
>> > > > > > extent
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > >>>> change.
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> I don't think it's wise to read parts.
>> > > > > > > > > > >>>> I did my very best effort to minimize it, but the
>> scope
>> > > is
>> > > > > > > simply
>> > > > > > > > > big.
>> > > > > > > > > > >>>> Open for suggestions, but it requires reading all
>> the
>> > > PIP
>> > > > :)
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> Thanks a lot Yunze for dedicating any time to it.
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> Let's talk back to the proposal, for now, what I
>> mainly
>> > > > > > learned
>> > > > > > > > and
>> > > > > > > > > > >>>>> are concerned about mostly are:
>> > > > > > > > > > >>>>> 1. Pulsar has many ways to expose metrics. It's
>> not
>> > > > unified
>> > > > > > and
>> > > > > > > > > > >> confusing.
>> > > > > > > > > > >>>>> 2. The current metrics system cannot support a
>> large
>> > > > amount
>> > > > > > of
>> > > > > > > > > topics.
>> > > > > > > > > > >>>>> 3. It's hard for plugin authors to integrate
>> metrics.
>> > > > (For
>> > > > > > > > example,
>> > > > > > > > > > >>>>> KoP [2] integrates metrics by implementing the
>> > > > > > > > > > >>>>> PrometheusRawMetricsProvider interface and it
>> indeed
>> > > > needs
>> > > > > > much
>> > > > > > > > > work)
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> Regarding the 1st issue, this proposal chooses
>> > > > > OpenTelemetry
>> > > > > > > > > (OTel).
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> Regarding the 2nd issue, I scrolled to the "Why
>> > > > > > OpenTelemetry?"
>> > > > > > > > > > >>>>> section. It's still frustrating to see no answer.
>> > > > > > Eventually, I
>> > > > > > > > > found
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> OpenTelemetry isn't the solution for large amount
>> of
>> > > > topic.
>> > > > > > > > > > >>>> The solution is described at
>> > > > > > > > > > >>>> "Aggregate and Filtering to solve cardinality
>> issues"
>> > > > > section.
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>> the explanation in the "What we need to fix in
>> > > > > OpenTelemetry
>> > > > > > -
>> > > > > > > > > > >>>>> Performance" section. It seems that we still need
>> some
>> > > > > > > > > enhancements in
>> > > > > > > > > > >>>>> OTel. In other words, currently OTel is not ready
>> for
>> > > > > > resolving
>> > > > > > > > all
>> > > > > > > > > > >>>>> these issues listed in the proposal but we
>> believe it
>> > > > will.
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> Let me rephrase "believe" --> we work together
>> with the
>> > > > > > > > maintainers
>> > > > > > > > > to
>> > > > > > > > > > >> do
>> > > > > > > > > > >>>> it, yes.
>> > > > > > > > > > >>>> I am open for any other suggestion.
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> As for the 3rd issue, from the "Integrating with
>> Pulsar
>> > > > > > > Plugins"
>> > > > > > > > > > >>>>> section, the plugin authors still need to
>> implement the
>> > > > new
>> > > > > > > OTel
>> > > > > > > > > > >>>>> interfaces. Is it much easier than using the
>> existing
>> > > > ways
>> > > > > to
>> > > > > > > > > expose
>> > > > > > > > > > >>>>> metrics? Could metrics still be easily integrated
>> with
>> > > > > > Grafana?
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> Yes, it's way easier.
>> > > > > > > > > > >>>> Basically you have a full fledged metrics library
>> > > objects:
>> > > > > > > Meter,
>> > > > > > > > > Gauge,
>> > > > > > > > > > >>>> Histogram, Counter.
>> > > > > > > > > > >>>> No more Raw Metrics Provider, writing UTF-8 bytes
>> in
>> > > > > > Prometheus
>> > > > > > > > > format.
>> > > > > > > > > > >>>> You get namespacing for free with Meter name and
>> > > version.
>> > > > > > > > > > >>>> It's way better than current solution and any other
>> > > > library.
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> That's all I am concerned about at the moment. I
>> > > > > understand,
>> > > > > > > and
>> > > > > > > > > > >>>>> appreciate that you've spent much time studying
>> and
>> > > > > > explaining
>> > > > > > > > all
>> > > > > > > > > > >>>>> these things. But, this proposal is still too
>> huge.
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>> I appreciate your effort a lot!
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> [1]
>> > > > > > > > >
>> > > https://lists.apache.org/thread/04jxqskcwwzdyfghkv4zstxxmzn154kf
>> > > > > > > > > > >>>>> [2]
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> https://github.com/streamnative/kop/blob/master/kafka-impl/src/main/java/io/streamnative/pulsar/handlers/kop/stats/PrometheusMetricsProvider.java
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> Thanks,
>> > > > > > > > > > >>>>> Yunze
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>> On Sun, May 7, 2023 at 5:53 PM Asaf Mesika <
>> > > > > > > > asaf.mes...@gmail.com>
>> > > > > > > > > > >> wrote:
>> > > > > > > > > > >>>>>>
>> > > > > > > > > > >>>>>> I'm very appreciative for feedback from multiple
>> > > pulsar
>> > > > > > users
>> > > > > > > > and
>> > > > > > > > > devs
>> > > > > > > > > > >>>>> on
>> > > > > > > > > > >>>>>> this PIP, since it has dramatic changes
>> suggested and
>> > > > > quite
>> > > > > > > > > extensive
>> > > > > > > > > > >>>>>> positive change for the users.
>> > > > > > > > > > >>>>>>
>> > > > > > > > > > >>>>>>
>> > > > > > > > > > >>>>>> On Thu, Apr 27, 2023 at 7:32 PM Asaf Mesika <
>> > > > > > > > > asaf.mes...@gmail.com>
>> > > > > > > > > > >>>>> wrote:
>> > > > > > > > > > >>>>>>
>> > > > > > > > > > >>>>>>> Hi all,
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> I'm very excited to release a PIP I've been
>> working
>> > > on
>> > > > in
>> > > > > > the
>> > > > > > > > > past 11
>> > > > > > > > > > >>>>>>> months, which I think will be immensely
>> valuable to
>> > > > > Pulsar,
>> > > > > > > > > which I
>> > > > > > > > > > >>>>> like so
>> > > > > > > > > > >>>>>>> much.
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> PIP:
>> https://github.com/apache/pulsar/issues/20197
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> I'm quoting here the preface:
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> === QUOTE START ===
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> Roughly 11 months ago, I started working on
>> solving
>> > > the
>> > > > > > > biggest
>> > > > > > > > > issue
>> > > > > > > > > > >>>>> with
>> > > > > > > > > > >>>>>>> Pulsar metrics: the lack of ability to monitor a
>> > > pulsar
>> > > > > > > broker
>> > > > > > > > > with a
>> > > > > > > > > > >>>>> large
>> > > > > > > > > > >>>>>>> topic count: 10k, 100k, and future support of
>> 1M.
>> > > This
>> > > > > > > started
>> > > > > > > > by
>> > > > > > > > > > >>>>> mapping
>> > > > > > > > > > >>>>>>> the existing functionality and then enumerating
>> all
>> > > the
>> > > > > > > > problems
>> > > > > > > > > I
>> > > > > > > > > > >>>>> saw (all
>> > > > > > > > > > >>>>>>> documented in this doc
>> > > > > > > > > > >>>>>>> <
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> https://docs.google.com/document/d/1vke4w1nt7EEgOvEerPEUS-Al3aqLTm9cl2wTBkKNXUA/edit?usp=sharing
>> > > > > > > > > >
>> > > > > > > > > > I thought we were going to stop using Google docs for
>> PIPs.
>> > > > > > > > > >
>> > > > > > > > > > >>>>>>
>> > > > > > > > > > >>>>>>> ).
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> This PIP is a parent PIP. It aims to gradually
>> solve
>> > > > > (using
>> > > > > > > > > sub-PIPs)
>> > > > > > > > > > >>>>> all
>> > > > > > > > > > >>>>>>> the current metric system's problems and
>> provide the
>> > > > > > ability
>> > > > > > > to
>> > > > > > > > > > >>>>> monitor a
>> > > > > > > > > > >>>>>>> broker with a large topic count, which is
>> currently
>> > > > > > lacking.
>> > > > > > > > As a
>> > > > > > > > > > >>>>> parent
>> > > > > > > > > > >>>>>>> PIP, it will describe each problem and its
>> solution
>> > > at
>> > > > a
>> > > > > > high
>> > > > > > > > > level,
>> > > > > > > > > > >>>>>>> leaving fine-grained details to the sub-PIPs.
>> The
>> > > > parent
>> > > > > > PIP
>> > > > > > > > > ensures
>> > > > > > > > > > >>>>> all
>> > > > > > > > > > >>>>>>> solutions align and does not contradict each
>> other.
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> The basic building block to solve the monitoring
>> > > > ability
>> > > > > of
>> > > > > > > > large
>> > > > > > > > > > >>>>> topic
>> > > > > > > > > > >>>>>>> count is aggregating internally (to topic
>> groups) and
>> > > > > > adding
>> > > > > > > > > > >>>>> fine-grained
>> > > > > > > > > > >>>>>>> filtering. We could have shoe-horned it into the
>> > > > existing
>> > > > > > > > metric
>> > > > > > > > > > >>>>> system,
>> > > > > > > > > > >>>>>>> but we thought adding that to a system already
>> > > > ingrained
>> > > > > > with
>> > > > > > > > > many
>> > > > > > > > > > >>>>> problems
>> > > > > > > > > > >>>>>>> would be wrong and hard to do gradually, as so
>> many
>> > > > > things
>> > > > > > > will
>> > > > > > > > > > >>>>> break. This
>> > > > > > > > > > >>>>>>> is why the second-biggest design decision
>> presented
>> > > > here
>> > > > > is
>> > > > > > > > > > >>>>> consolidating
>> > > > > > > > > > >>>>>>> all existing metric libraries into a single one
>> -
>> > > > > > > OpenTelemetry
>> > > > > > > > > > >>>>>>> <https://opentelemetry.io/>. The parent PIP
>> will
>> > > > explain
>> > > > > > why
>> > > > > > > > > > >>>>>>> OpenTelemetry was chosen out of existing
>> solutions
>> > > and
>> > > > > why
>> > > > > > it
>> > > > > > > > far
>> > > > > > > > > > >>>>> exceeds
>> > > > > > > > > > >>>>>>> all other options. I’ve been working closely
>> with the
>> > > > > > > > > OpenTelemetry
>> > > > > > > > > > >>>>>>> community in the past eight months:
>> brain-storming
>> > > this
>> > > > > > > > > integration,
>> > > > > > > > > > >>>>> and
>> > > > > > > > > > >>>>>>> raising issues, in an effort to remove serious
>> > > blockers
>> > > > > to
>> > > > > > > make
>> > > > > > > > > this
>> > > > > > > > > > >>>>>>> migration successful.
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> I made every effort to summarize this document
>> so
>> > > that
>> > > > it
>> > > > > > can
>> > > > > > > > be
>> > > > > > > > > > >>>>> concise
>> > > > > > > > > > >>>>>>> yet clear. I understand it is an effort to read
>> it
>> > > and,
>> > > > > > more
>> > > > > > > > so,
>> > > > > > > > > > >>>>> provide
>> > > > > > > > > > >>>>>>> meaningful feedback on such a large document;
>> hence
>> > > I’m
>> > > > > > very
>> > > > > > > > > grateful
>> > > > > > > > > > >>>>> for
>> > > > > > > > > > >>>>>>> each individual who does so.
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> I think this design will help improve the user
>> > > > experience
>> > > > > > > > > immensely,
>> > > > > > > > > > >>>>> so it
>> > > > > > > > > > >>>>>>> is worth the time spent reading it.
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> === QUOTE END ===
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> Thanks!
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>>> Asaf Mesika
>> > > > > > > > > > >>>>>>>
>> > > > > > > > > > >>>>>
>> > > > > > > > > > >>>>
>> > > > > > > > > > >>
>> > > > > > > > > > >>
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>>
>

Re: [DISCUSS] PIP-264: Enhanced OTel-based metric system

Reply via email to