[ https://issues.apache.org/jira/browse/FLINK-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310618#comment-16310618 ]
Wei-Che Wei commented on FLINK-7935:
------------------------------------

Hi [~elevy],

What you described is almost correct. FLINK-7692 lets users expose their own variables on a {{MetricGroup}}, but mapping the metric name and the metric's variables to a third-party metric system is the reporter's responsibility.

You can use {{MetricGroup#getAllVariables()}} to get {{type:messageType}} and the other system scope variables. These can be mapped to tags in the DataDog reporter.

{{AbstractMetricGroup#getLogicalScope(CharacterFilter)}} returns {{<parent.logical.scope>.messages.type}}, so a reporter can use this function to build the metric name, which will be {{<parent.logical.scope>.messages.type.counts}}. For example, the Prometheus reporter uses it to build the metric name. [[1|https://github.com/apache/flink/blob/beb11976fe63c20a5dc9f22ea713c05b4d5e9585/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java#L217]]

However, {{MetricGroup#getMetricIdentifier(String)}} will still return {{<parent.identifier>.messages.type.<messageType>}}. It seems that the DataDog reporter uses this function to get the metric name. [[2|https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L63]]

I think this is a limitation of the DataDog reporter. Maybe we can make {{AbstractMetricGroup#getLogicalScope(CharacterFilter)}} a public API and update the DataDog reporter.

cc [~Zentol] Do you have any suggestions or comments? If I made any mistakes in my comment, please correct me. Thank you.

> Metrics with user supplied scope variables
> ------------------------------------------
>
>                 Key: FLINK-7935
>                 URL: https://issues.apache.org/jira/browse/FLINK-7935
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics
>    Affects Versions: 1.3.2
>            Reporter: Elias Levy
>
> We use DataDog for metrics. DD and Flink differ somewhat in how they track
> metrics.
> Flink names and scopes metrics together, at least by default. E.g. by default
> the System scope for operator metrics is
> {{<host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>}}.
> The scope variables become part of the metric's full name.
> In DD the metric would be named something generic, e.g.
> {{taskmanager.job.operator}}, and they would be distinguished by their tag
> values, e.g. {{tm_id=foo}}, {{job_name=var}}, {{operator_name=baz}}.
> Flink allows you to configure the format string for system scopes, so it is
> possible to set the operator scope format to {{taskmanager.job.operator}}.
> We do this for all scopes:
> {code}
> metrics.scope.jm: jobmanager
> metrics.scope.jm.job: jobmanager.job
> metrics.scope.tm: taskmanager
> metrics.scope.tm.job: taskmanager.job
> metrics.scope.task: taskmanager.job.task
> metrics.scope.operator: taskmanager.job.operator
> {code}
> This seems to work. The DataDog Flink metric's plugin submits all scope
> variables as tags, even if they are not used within the scope format. And it
> appears internally this does not lead to metrics conflicting with each other.
> We would like to extend this to user defined metrics, but you can define
> variables/scopes when adding a metric group or metric with the user API, so
> that in DD we have a single metric with a tag with many different values,
> rather than hundreds of metrics to just the one value we want to measure
> across different event types.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
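
The mapping proposed in the comment can be illustrated with a minimal sketch in plain Java, with no Flink dependency. It is not the DataDog reporter's actual code. The class {{LogicalScopeSketch}} and its methods are hypothetical names. The input map stands in for what {{MetricGroup#getAllVariables()}} returns, and it is assumed here that the keys are wrapped in angle brackets (e.g. {{<host>}}, {{<type>}}). The logical scope string stands in for the result of {{AbstractMetricGroup#getLogicalScope(CharacterFilter)}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Flink: shows how a reporter could derive
// a generic metric name plus tags instead of one long identifier.
final class LogicalScopeSketch {

    private LogicalScopeSketch() {}

    // Assumption: variable keys arrive wrapped in angle brackets, e.g. "<host>".
    // Keys without brackets are returned unchanged.
    static String variableName(String key) {
        if (key.length() >= 2 && key.startsWith("<") && key.endsWith(">")) {
            return key.substring(1, key.length() - 1);
        }
        return key;
    }

    // Turns every scope variable into a "name:value" tag.
    // Sorted by key so the output order is deterministic.
    static List<String> toTags(Map<String, String> allVariables) {
        List<String> tags = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(allVariables).entrySet()) {
            tags.add(variableName(e.getKey()) + ":" + e.getValue());
        }
        return tags;
    }

    // Builds the metric name from the logical scope, which contains the
    // variable keys but not their values, and the metric's own name.
    static String metricName(String logicalScope, String name) {
        return logicalScope + "." + name;
    }
}
```

With a logical scope of {{taskmanager.job.operator.messages.type}} and a counter named {{counts}}, this yields a single metric name for all message types, and the message type itself travels as a tag.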