I'd like this to happen, but it hasn't been very high on anybody's priority list.
There are a couple of things that would be good to do:

1. At the application level: consolidate task metrics and accumulators. They overlap substantially and, at a high level, should simply be consolidated. There may be some differences in semantics w.r.t. retries or fault tolerance, but those can just be modes in the consolidated interface/implementation. Once we do that, users can effectively use the new consolidated interface to add new metrics.

2. At the process/service monitoring level: expose an internal metrics interface to make it easier to create new metrics and publish them via a REST interface. Last time I looked at this (~4 weeks ago), publishing the current metrics was not as straightforward as I was hoping. We use the Codahale library only in some places (IIRC just the cluster manager, not the actual executors). It would make sense to create a simple wrapper around the Codahale library to make it easier to create new metrics.

I've appended rough sketches of both ideas below the quoted message.

On Thu, Aug 27, 2015 at 12:21 PM, Atsu Kakitani <atkakit...@groupon.com> wrote:

> Hi,
>
> I was wondering if there are any plans to open up the API for Spark's
> metrics system. I want to write custom sources and sinks, but these
> interfaces aren't public right now. I saw that there was also an issue open
> for this (https://issues.apache.org/jira/browse/SPARK-5630), but it
> hasn't been addressed - is there a reason why these interfaces are kept
> private?
>
> Thanks,
> Atsu
>
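Re: (1), here's a rough sketch of what a consolidated interface might look like. To be clear, everything below is hypothetical (none of these names exist in Spark today); it's only meant to illustrate that the retry/fault-tolerance semantics could become a mode on one shared abstraction rather than two separate systems:

// Hypothetical sketch only; none of these names exist in Spark today.

object CountMode extends Enumeration {
  // CountOnce: retried/speculative task attempts don't double-count,
  // which is the semantics users expect from task metrics.
  // CountAll: every task attempt contributes, which matches how
  // accumulator updates behave inside transformations today.
  val CountOnce, CountAll = Value
}

// One user-facing handle subsuming both accumulators and task metrics:
// add() is called from task code, value is read on the driver.
trait TaskMetric[T] extends Serializable {
  def name: String
  def mode: CountMode.Value
  def add(v: T): Unit
  def value: T
}

// Trivial local implementation just to show the shape; a real one would
// ship partial updates from executors back to the driver, the way
// accumulators do today, and apply the CountMode when merging them.
class LongMetric(val name: String, val mode: CountMode.Value)
    extends TaskMetric[Long] {
  private var sum = 0L
  def add(v: Long): Unit = synchronized { sum += v }
  def value: Long = synchronized { sum }
}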
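Re: (2), a minimal sketch of the kind of Codahale wrapper I mean. The MetricRegistry, counter, register, and Gauge calls are the real Dropwizard/Codahale API; the wrapper class and its method names are made up for illustration:

import com.codahale.metrics.{Counter, Gauge, MetricRegistry}
import scala.collection.JavaConverters._

// Thin wrapper so internal code can create metrics without touching
// Codahale types directly.
class SimpleMetrics(prefix: String) {
  private val registry = new MetricRegistry

  // The registry caches by name, so repeated calls with the same name
  // return the same Counter.
  def counter(name: String): Counter =
    registry.counter(MetricRegistry.name(prefix, name))

  // Register a gauge backed by an arbitrary thunk (e.g. a queue size),
  // re-evaluated on every read.
  def gauge[T](name: String)(f: => T): Unit =
    registry.register(MetricRegistry.name(prefix, name), new Gauge[T] {
      override def getValue: T = f
    })

  // Snapshot counter values as a plain Scala map; a REST handler can
  // serialize this to JSON without knowing anything about Codahale.
  def counters(): Map[String, Long] =
    registry.getCounters.asScala.mapValues(_.getCount).toMap
}

The point of the wrapper is that the REST publication path only ever sees plain maps, so the underlying library could be swapped later without touching callers.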