I'm writing a metrics sink and reporter to push metrics to Elasticsearch.
An example format of a metric in JSON:

{
 "timestamp": "2016-03-15T16:11:19.314+0000",
 "hostName": "10.192.0.87",
 "applicationName": "My application",
 "applicationId": "app-20160315093931-0003",
 "executorId": "17",
 "executor_threadpool_completeTasks": 20
}

For correlating the metrics I want the timestamp, hostname, applicationId,
executorId and applicationName.

Currently I extract the applicationId and executorId from the metric
name, since MetricsSystem prepends them to it. Because the sink is
instantiated without the SparkConf, I cannot determine the applicationName.
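
To make the name-based extraction concrete, here is a rough sketch of the
kind of parsing I mean. It assumes the metric name has the shape
<applicationId>.<executorId>.<rest> (for example
app-20160315093931-0003.17.executor.threadpool.completeTasks). The class
MetricNameParser and its fields are hypothetical names for illustration,
not part of Spark.

```java
import java.util.Optional;

// Hypothetical helper, not a Spark API: splits a prefixed metric name into
// the parts needed for correlation.
public final class MetricNameParser {

    /** The pieces recovered from a prefixed metric name. */
    public static final class Parsed {
        public final String applicationId;
        public final String executorId;
        public final String metric;

        Parsed(String applicationId, String executorId, String metric) {
            this.applicationId = applicationId;
            this.executorId = executorId;
            this.metric = metric;
        }
    }

    private MetricNameParser() {}

    /**
     * Assumes the name looks like "<applicationId>.<executorId>.<rest>".
     * Returns empty if the name does not have at least three non-empty parts.
     */
    public static Optional<Parsed> parse(String name) {
        if (name == null) {
            return Optional.empty();
        }
        // Limit of 3 keeps any further dots inside the metric part.
        String[] parts = name.split("\\.", 3);
        if (parts.length < 3
                || parts[0].isEmpty() || parts[1].isEmpty() || parts[2].isEmpty()) {
            return Optional.empty();
        }
        // Replace remaining dots with underscores so the metric part can be
        // used as a single JSON field name.
        return Optional.of(new Parsed(parts[0], parts[1], parts[2].replace('.', '_')));
    }
}
```

This only works while the prefix stays in the name, which is why it cannot
recover the applicationName.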

Another proposed change in https://issues.apache.org/jira/browse/SPARK-10610
would also require me to have access to the SparkConf to get the
applicationId and executorId.

So, is the SparkConf a singleton, and could there be a Utils method for
accessing it? Instantiating a SparkConf myself will not pick up the appName
etc., as these are set via methods on the conf.

I'm trying to write this without modifying any Spark code by just using a
definition in the metrics properties to load my sink.

Cheers,
