Hi all,

If you can wait for Flink 1.16, it adds a new feature for filtering metrics (include/exclude filters). In addition, the current release already lets you drop unnecessary labels with `scope.variables.excludes`.

Link to the 1.16 metric reporter features: https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/metric_reporters/#reporter
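
As a rough sketch of the `scope.variables.excludes` option, this is how it might look in flink-conf.yaml. The reporter name `influxdb` is an assumption (use whatever name your reporter is registered under), and the excluded variables are only an example picked from the tag list in the original question. Please check the exact key and value format against the docs for your Flink version.

```yaml
# Assumption: the InfluxDB reporter is registered under the name "influxdb".
# Semicolon-separated list of scope variables to leave out of the tags
# sent by this reporter (example values, adjust to your needs).
metrics.reporter.influxdb.scope.variables.excludes: task_attempt_id;task_attempt_num
```

In principle, dropping the per-attempt variables should stop each restart from producing a fresh set of series, provided the remaining tags stay the same across restarts.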
Best,
Mason

On Fri, Jul 1, 2022 at 3:55 AM Martijn Visser <martijnvis...@apache.org> wrote:

> Have you considered setting the value for some of the series to a fixed
> value? For example, if you're not interested in the value for <task_id>,
> you could consider setting that to a fixed value "task_id" [1]?
>
> Best regards,
>
> Martijn
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/#system-scope
>
> On Thu, Jun 30, 2022 at 15:52, Weihua Hu <huweihua....@gmail.com> wrote:
>
>> Hi, Filip
>>
>> You can modify the InfluxdbReporter code to override the
>> notifyOfAddedMetric method and filter the required metrics for reporting.
>>
>> Best,
>> Weihua
>>
>>
>> On Thu, Jun 30, 2022 at 8:46 PM Filip Karnicki <filip.karni...@gmail.com>
>> wrote:
>>
>>> Hi All
>>>
>>> We're using the influx reporter (flink 1.14.3), which seems to create a
>>> series per:
>>> - [task|job]manager
>>> - host
>>> - job_id
>>> - job_name
>>> - subtask_index
>>> - task_attempt_id
>>> - task_attempt_num
>>> - task_id
>>> - tm_id
>>>
>>> which amounts to about 4k series each time our job restarts itself.
>>>
>>> We are currently experiencing problems with checkpoint duration timeouts
>>> (> 60s) (unrelated), and every 60 secs our job restarts and creates a
>>> further 4k series in influxdb.
>>>
>>> Needless to say, the team managing influxdb is not too happy with the
>>> number of series we create.
>>>
>>> Is there anything I can do to either reduce the number of series, or
>>> reduce the number of types of metrics in order to produce fewer series? (We
>>> don't view all the available metrics in grafana, so we don't necessarily
>>> have to send all of them.)
>>>
>>> The db caps at 1M series, and with our current problems with
>>> checkpointing we go through that many in a matter of hours.
>>>
>>> Many thanks
>>> Fil