Just curious - is this HttpSink your own custom sink, or a Dropwizard configuration?
If it's your own custom code, I would suggest looking at / trying out Dropwizard. See:
http://spark.apache.org/docs/latest/monitoring.html#metrics
https://metrics.dropwizard.io/4.0.0/

Also, from what I know, the metrics from the tasks/executors are sent as accumulator values to the driver, and the driver makes them available to the desired sink.

Furthermore, even without a custom HttpSink, there's already a built-in REST API that provides you metrics. See:
http://spark.apache.org/docs/latest/monitoring.html#rest-api

While you can surely create your own custom sink (code), I would say try out custom configuration first, as it will make Spark upgrades easier.

On 12/20/18, 3:53 PM, "Marcelo Vanzin" <van...@cloudera.com.INVALID> wrote:

First, it's really weird to use "org.apache.spark" for a class that is not in Spark.

For executors, the jar file of the sink needs to be in the system classpath; the application jar is not in the system classpath, so that does not work. There are different ways for you to get it there, most of them manual (YARN is, I think, the only RM supported in Spark where the application itself can do it).

On Thu, Dec 20, 2018 at 1:48 PM prosp4300 <prosp4...@163.com> wrote:
>
> Hi, Spark Users
>
> I'm playing with Spark metrics monitoring, and I want to add a custom sink, an HttpSink that sends the metrics through a RESTful API.
> A subclass of Sink, "org.apache.spark.metrics.sink.HttpSink", was created and packaged within the application jar.
>
> It works for the driver instance, but once enabled for the executor instances, the following ClassNotFoundException is thrown. This seems to be because the MetricsSystem is started very early for executors, before the application jar is loaded.
>
> I wonder, is there any way or best practice to add a custom sink for executor instances?
>
> 18/12/21 04:58:32 ERROR MetricsSystem: Sink class org.apache.spark.metrics.sink.HttpSink cannot be instantiated
> 18/12/21 04:58:32 WARN UserGroupInformation: PriviledgedActionException as:yarn (auth:SIMPLE) cause:java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.HttpSink
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1933)
>         at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> Caused by: java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.HttpSink
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:348)
>         at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
>         at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:198)
>         at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:194)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>         at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
>         at org.apache.spark.metrics.MetricsSystem.registerSinks(MetricsSystem.scala:194)
>         at org.apache.spark.metrics.MetricsSystem.start(MetricsSystem.scala:102)
>         at org.apache.spark.SparkEnv$.create(SparkEnv.scala:366)
>         at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:201)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:223)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         ... 4 more
>
> 18/12/21 04:58:00 ERROR org.apache.spark.metrics.MetricsSystem.logError:70 - Sink class org.apache.spark.metrics.sink.HttpSink cannot be instantiated

-- 
Marcelo
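[Editor's note] To make the configuration route concrete: Spark sinks are wired up through a metrics.properties file using the `[instance].sink.[name].[option]` syntax. A minimal sketch follows; the class name `com.example.metrics.HttpSink` and the sink name `http` are placeholders, not names from this thread (and, per Marcelo's point, a custom sink belongs in your own package, not under "org.apache.spark"):

```properties
# metrics.properties -- minimal sketch; custom class name is hypothetical.
# Enable the custom sink only for executor instances:
executor.sink.http.class=com.example.metrics.HttpSink

# Built-in console sink for all instances, polling every 10 seconds,
# as shown in the Spark monitoring docs:
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds
```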
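[Editor's note] On YARN, one common way to satisfy the system-classpath requirement Marcelo describes is to ship the sink jar with the application and point `spark.executor.extraClassPath` at its name in the container's working directory, where `--jars` localizes it. A hedged sketch only; the jar and file names (`http-sink.jar`, `metrics.properties`, `my-app.jar`) are hypothetical:

```shell
# Sketch, assuming YARN: --jars copies http-sink.jar into each executor's
# container working directory, and extraClassPath puts it on the executor's
# *system* classpath, so it is visible before the application jar is loaded.
spark-submit \
  --master yarn \
  --files metrics.properties \
  --conf spark.metrics.conf=metrics.properties \
  --jars http-sink.jar \
  --conf spark.executor.extraClassPath=http-sink.jar \
  my-app.jar
```

This avoids the ClassNotFoundException above because the sink jar is resolved by the executor's system class loader, not the application class loader that loads the app jar later.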