Hey Ryan,

Thank you for reading through the description. I published the FLIP in the hope that it will provide a better understanding of the proposal [1].
> Is the issue you are trying to resolve that we don't know if a Counter is
> monotonic?

Yes, this is the main concern of the FLIP: distinguishing monotonicity so that backend systems can tell that an integer overflow or a system restart occurred during a given period and smooth it out in graphs.

If Counter.dec were widely used in Flink we would be having a different discussion, but since it was only introduced for historical reasons (Dropwizard compatibility?) and is never used within our codebase, I believe declaring monotonicity is worth deprecating Counter.dec (and providing the UpDownCounter alternative).

> Prometheus casts it to a Gauge because that code path hasn't been upgraded in
> years

Prometheus' latest Java client still throws [2] if you try to decrease a counter. This is by design, and it is the reason Flink counters are mapped to Prometheus gauges. You didn't share more about the internal implementation, but if it uses Prometheus counters, I'd assume negative deltas are ignored, as in the OTel reporter today.

This FLIP can actually pave the way for it to be contributed: with true monotonicity, a Flink counter will map correctly to a Prometheus Counter (please check out the "proposed change" section of the FLIP).

Let me know what you think.

Efrat

[1] https://cwiki.apache.org/confluence/x/twDuGQ
[2] https://github.com/prometheus/client_java/blob/main/prometheus-metrics-core/src/main/java/io/prometheus/metrics/core/metrics/Counter.java#L189

On Mon, 22 Jun 2026 at 17:23, Ryan van Huuksloot via dev <[email protected]> wrote:
>
> Hi Efrat,
>
> I'd like to clarify the intention of the FLIP, as it seems like we are
> introducing multiple issues.
>
> Is the issue you are trying to resolve that we don't know if a Counter is
> monotonic?
>
> After reading the links and the FLIP description, it sounds like we need to
> fix the OTEL reporter. I don't immediately see any major reason to overhaul
> the entire counter system. The current Counter can go up and down, but the
> OTEL reporter didn't respect that.
>
> I also wanted to point out that Prometheus casts it to a Gauge because that
> code path hasn't been upgraded in years. It should use the Native Counters
> that now exist in Prometheus, we just haven't done the work to
> upgrade/migrate. We have an internal implementation to move Flink Counters
> to Prometheus Native Counters.
>
> Is there something I am missing?
>
> Ryan van Huuksloot
> Staff Engineer, Infrastructure | Streaming Platform
>
> On Mon, Jun 15, 2026 at 2:07 PM Efrat Levitan <[email protected]> wrote:
>
> > Hi all,
> > I would like to start a discussion about how to differentiate
> > monotonic and non monotonic counters in flink metrics.
> >
> > Monotonic (ever-increasing) Counters can benefit from automatic reset
> > detection on the monitoring system side (when the value drops we can
> > safely assume the process was reset and synthetically adjust the
> > value)
> >
> > Historically, flink Counter can be decremented / incremented by a
> > non-positive value, but this has almost never been used intentionally
> > across the flink codebase, i.e if a counter got decremented this is
> > usually a bug [1].
> > So though system-emitted counters are effectively monotonic, exporters
> > must respect org.apache.flink.metrics.Counter contract and assume
> > non-monotonicity.
> >
> > I'd like to propose deprecating org.apache.flink.metrics.Counter#dec
> > in favor of a new UpDownCounter implementation. This matches modern
> > metric APIs like OTel, where the regular Counter is monotonic[2] and
> > an additional UpDownCounter supports[3] non-positive additions.
> >
> > While it seems to be the cleanest approach, we could still avoid the
> > deprecation by introducing a MonotonicCounter and have all flink
> > counters migrated, or expand the Counter interface to declare
> > monotonicity (based on the implementation).
> >
> > Recognising monotonicity will also align counters reporting across
> > monitoring systems. Today, for instance, Otel reporter drops[4]
> > non-incremental data points with a warning, while Prometheus reporter
> > casts[5] them as Gauges.
> >
> > I'm looking forward to your feedback
> > Efrat
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-39892
> > [2]
> > https://github.com/open-telemetry/opentelemetry-java/blob/main/api/all/src/main/java/io/opentelemetry/api/metrics/LongCounter.java#L40
> > [3]
> > https://github.com/open-telemetry/opentelemetry-java/blob/main/api/all/src/main/java/io/opentelemetry/api/metrics/LongUpDownCounter.java
> > [4] https://issues.apache.org/jira/browse/FLINK-39893
> > [5]
> > https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/AbstractPrometheusReporter.java#L177-L184
> >
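
For readers following the thread, here is a minimal sketch of the reset detection mentioned above, i.e. how a monitoring backend might compute the increase of a counter it knows to be monotonic. The class and method names (ResetAwareIncrease, increase) are hypothetical and are not part of Flink, Prometheus, or OTel; the rule that a drop means "restarted from zero" is an assumption about the backend, not something the FLIP specifies.

```java
// Sketch only: hypothetical names, not a Flink, Prometheus, or OTel API.
final class ResetAwareIncrease {

    /**
     * Total increase across consecutive samples of a counter that is
     * assumed to be monotonic. A drop between two samples is treated as a
     * reset to zero, so the new value itself is counted as the increase.
     */
    static long increase(long[] samples) {
        long total = 0;
        for (int i = 1; i < samples.length; i++) {
            long prev = samples[i - 1];
            long curr = samples[i];
            // curr < prev can only mean a restart if the counter never decrements.
            total += (curr >= prev) ? curr - prev : curr;
        }
        return total;
    }

    public static void main(String[] args) {
        // 5 -> 10 adds 5, the drop to 3 is read as a restart and adds 3, 3 -> 7 adds 4.
        System.out.println(increase(new long[] {5, 10, 3, 7})); // prints 12
    }
}
```

If the same series came from a counter that was legitimately decremented via Counter.dec, the drop from 10 to 3 would be misread as a restart, which is why the backend needs to know up front whether the counter is monotonic.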
