Just curious - is this HttpSink your own custom sink, or something configured 
through Spark's Dropwizard metrics support?

If it is your own custom code, I would suggest looking into the Dropwizard metrics integration that Spark already ships with.
See 
http://spark.apache.org/docs/latest/monitoring.html#metrics
https://metrics.dropwizard.io/4.0.0/
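For example, the built-in Dropwizard sinks can be enabled purely through metrics.properties, with no custom code; a minimal sketch (the Graphite host/port below are placeholders):

```properties
# conf/metrics.properties - enable built-in sinks on all instances ("*"),
# i.e. driver and executors, polled every 10 seconds.
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds

# Or a Graphite sink instead (host/port are placeholders):
#*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
#*.sink.graphite.host=graphite.example.com
#*.sink.graphite.port=2003
```

Point Spark at the file with spark.metrics.conf if it is not in the default conf/ location.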

Also, from what I know, the metrics from tasks/executors are sent back to the 
driver as accumulator updates, and the driver makes them available to the 
configured sinks.

Furthermore, even without a custom HttpSink, there's already a built-in REST API 
that exposes these metrics.
See http://spark.apache.org/docs/latest/monitoring.html#rest-api
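To illustrate, a sketch of consuming that REST API (the host/port and the sample payload below are assumptions; the real shape is documented at the link above):

```python
import json

# Base URL of the Spark UI's REST API; host and port are assumptions
# for this sketch (4040 is the default UI port of a running driver).
base_url = "http://localhost:4040/api/v1"

# A response from GET {base_url}/applications looks roughly like this
# (field names follow the monitoring docs; the values are made up):
sample_response = """
[
  {
    "id": "app-20181221045800-0001",
    "name": "my-app",
    "attempts": [
      {"completed": false, "sparkUser": "yarn"}
    ]
  }
]
"""

apps = json.loads(sample_response)
for app in apps:
    # Each application id can then be used to drill down further,
    # e.g. GET {base_url}/applications/{id}/executors
    print(app["id"], app["name"])
```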

While you can surely create your own custom sink (code), I would say try the 
configuration-only route first, as it will make Spark upgrades easier.
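If you do stick with the custom sink, then on the executor-classpath point Marcelo makes below, one way to get the jar onto the executors' system classpath on YARN is to ship it separately from the application jar; a sketch (the jar and file names are hypothetical):

```shell
# metrics-sink.jar holds only the HttpSink class, built separately from
# the application jar. --jars localizes it into each YARN container's
# working directory, and extraClassPath (a bare relative name, resolved
# against that working directory) puts it on the system classpath early
# enough for MetricsSystem to find it.
spark-submit \
  --master yarn \
  --files metrics.properties \
  --conf spark.metrics.conf=metrics.properties \
  --jars metrics-sink.jar \
  --conf spark.executor.extraClassPath=metrics-sink.jar \
  --class com.example.MyApp \
  application.jar
```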

On 12/20/18, 3:53 PM, "Marcelo Vanzin" <van...@cloudera.com.INVALID> wrote:

    First, it's really weird to use "org.apache.spark" for a class that is
    not in Spark.
    
    For executors, the jar file of the sink needs to be in the system
    classpath; the application jar is not in the system classpath, so that
    does not work. There are different ways for you to get it there, most
    of them manual (YARN is, I think, the only RM supported in Spark where
    the application itself can do it).
    
    On Thu, Dec 20, 2018 at 1:48 PM prosp4300 <prosp4...@163.com> wrote:
    >
    > Hi, Spark Users
    >
    > I'm playing with Spark metrics monitoring, and want to add a custom sink 
(HttpSink) that sends metrics through a RESTful API.
    > A subclass of Sink, "org.apache.spark.metrics.sink.HttpSink", is created 
and packaged within the application jar.
    >
    > It works for the driver instance, but once enabled for executor instances, 
the following ClassNotFoundException is thrown. This seems to be because the 
MetricsSystem is started very early on executors, before the application jar is 
loaded.
    >
    > I wonder, is there any way or best practice to add a custom sink for 
executor instances?
    >
    > 18/12/21 04:58:32 ERROR MetricsSystem: Sink class org.apache.spark.metrics.sink.HttpSink cannot be instantiated
    > 18/12/21 04:58:32 WARN UserGroupInformation: PriviledgedActionException as:yarn (auth:SIMPLE) cause:java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.HttpSink
    > Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1933)
    > at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
    > at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
    > at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
    > at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
    > Caused by: java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.HttpSink
    > at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    > at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    > at java.lang.Class.forName0(Native Method)
    > at java.lang.Class.forName(Class.java:348)
    > at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
    > at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:198)
    > at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:194)
    > at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    > at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    > at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
    > at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    > at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
    > at org.apache.spark.metrics.MetricsSystem.registerSinks(MetricsSystem.scala:194)
    > at org.apache.spark.metrics.MetricsSystem.start(MetricsSystem.scala:102)
    > at org.apache.spark.SparkEnv$.create(SparkEnv.scala:366)
    > at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:201)
    > at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:223)
    > at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
    > at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
    > at java.security.AccessController.doPrivileged(Native Method)
    > at javax.security.auth.Subject.doAs(Subject.java:422)
    > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    > ... 4 more
    > (from the aggregated container log, container_e81_1541584460930_3814_01_000005:)
    > 18/12/21 04:58:00 ERROR org.apache.spark.metrics.MetricsSystem.logError:70 - Sink class org.apache.spark.metrics.sink.HttpSink cannot be instantiated
    
    
    
    -- 
    Marcelo
    
    
