[ 
https://issues.apache.org/jira/browse/FLINK-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15810263#comment-15810263
 ] 

Shannon Carey commented on FLINK-5425:
--------------------------------------

I am using the statsd reporter and the data thereby flows to Graphite. You can 
see that the filter character method of the statsd reporter does not filter 
".": 
https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-statsd/src/main/java/org/apache/flink/metrics/statsd/StatsDReporter.java#L199

So, yes, this could be fixed in the reporter, and I could make a PR with that 
change... but it would impact dots in every part of the name, as you mention. 
While that might make sense for people like me who are using the Graphite 
backend (though it would change how our job names appear in the metrics since 
those contain periods), I'm not sure it makes sense for people who use other 
backends. Given the uncertainty, perhaps it would be better to add a 
configuration parameter which allows the user to control what characters get 
filtered out? A regex perhaps?
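
As a rough sketch of what such a configurable filter might look like (the 
class name, constructor parameters, and example pattern are all hypothetical; 
none of this is existing Flink code or configuration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical sketch of a user-configurable character filter for a metrics
 * reporter. The regex and replacement would come from reporter configuration;
 * no such configuration keys are assumed to exist in Flink today.
 */
class ConfigurableCharacterFilter {
    private final Pattern pattern;
    private final String replacement;

    ConfigurableCharacterFilter(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        // Quote the replacement so that '$' and '\' are treated literally.
        this.replacement = Matcher.quoteReplacement(replacement);
    }

    /** Replaces every match of the configured regex in one name component. */
    String filterCharacters(String input) {
        return pattern.matcher(input).replaceAll(replacement);
    }
}
```

With a pattern of "[.]" and a replacement of "-", a host value such as 
"10.0.0.1" would come out as "10-0-0-1", while users of other backends could 
leave the pattern unset and keep today's behavior.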

The reporter isn't really broken; it's just that the metric naming is 
inconsistent. To me, the simplest solution is to eliminate the difference in 
behavior between identifying the jobmanager by IP in the metrics and 
identifying the taskmanager by hostname.

This problem is definitely present in 1.1.3 (that's where I'm seeing it in my 
live systems), but you're right that the link I put in the description pointed 
to the then-current master.

I'm happy to submit a PR once the implementation approach has been decided... 
although I may need a little guidance about how to go about adjusting the 
TaskManagerLocation logic so JobManagerRunner can share it, if that's what gets 
decided.

> JobManager <host> replaced by IP in metrics
> -------------------------------------------
>
>                 Key: FLINK-5425
>                 URL: https://issues.apache.org/jira/browse/FLINK-5425
>             Project: Flink
>          Issue Type: Bug
>          Components: Metrics
>    Affects Versions: 1.1.3
>            Reporter: Shannon Carey
>            Priority: Minor
>
> In metrics at the jobmanager level and below, the "<host>" scope variable is 
> being replaced by the IP rather than the hostname. The taskmanager metrics, 
> meanwhile, use the host name.
> You can see the job manager behavior at 
> https://github.com/apache/flink/blob/a1934255421b97eefd579183e9c7199c43ad1a2c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobManagerRunner.java#L147
>  compared to TaskManagerLocation#getHostname().
> The problem with this is mainly that due to the presence of "." (period) 
> characters in the IP address and thereby the metric name, the metric names 
> show up strangely in Graphite/Grafana, where "." is the metric group 
> separator.
> If it's not possible to make jobmanager metrics use the hostname, I suggest 
> replacing "." with "-" in the <host> section.



