Hi Nik,

Can you have a look at this JIRA ticket [1] and check whether it is related to the problems you are facing? If so, would you mind leaving a comment there?

Thank you,
Fabian

[1] https://issues.apache.org/jira/browse/FLINK-8946

2018-05-31 4:41 GMT+02:00 Nikolas Davis <nda...@newrelic.com>:

> We keep track of metrics by using the value of
> MetricGroup::getMetricIdentifier,
> which returns the fully qualified metric name. The query that we use to
> monitor metrics filters for metrics IDs that match '%Status.JVM.Memory%'.
> As long as the new metrics come online via the MetricReporter interface
> then I think the chart would be continuous; we would just see the old JVM
> memory metrics cycle into new metrics.
>
> Nik Davis
> Software Engineer
> New Relic
>
> On Wed, May 30, 2018 at 5:30 PM, Ajay Tripathy <aj...@yelp.com> wrote:
>
>> How are your metrics dimensionalized/named? Task managers often have UIDs
>> generated for them. The task id dimension will change on restart. If you
>> name your metric based on this 'task_id' there would be a discontinuity
>> with the old metric.
>>
>> On Wed, May 30, 2018 at 4:49 PM, Nikolas Davis <nda...@newrelic.com>
>> wrote:
>>
>>> Howdy,
>>>
>>> We are seeing our task manager JVM metrics disappear over time. This
>>> last time we correlated it to our job crashing and restarting. I wasn't
>>> able to grab the failing exception to share. Any thoughts?
>>>
>>> We track metrics through the MetricReporter interface. As far as I can
>>> tell this more or less only affects the JVM metrics. I.e. most / all other
>>> metrics continue reporting fine as the job is automatically restarted.
>>>
>>> Nik Davis
>>> Software Engineer
>>> New Relic