I'd like this to happen, but it hasn't been a high priority for
anybody.

There are a couple of things that would be good to do:

1. At the application level: consolidate task metrics and accumulators.
They overlap substantially and, at a high level, should just be
consolidated. There may be some differences in semantics w.r.t. retries
or fault tolerance, but those could just be modes in the consolidated
interface/implementation.

Once we do that, users can effectively use the new consolidated
interface to add new metrics. (There's a rough sketch of what such an
interface could look like at the end of this list.)

2. At the process/service monitoring level: expose an internal metrics
interface to make it easier to create new metrics and publish them via a
REST interface. Last time I looked at this (~4 weeks ago), publishing
the current metrics was not as straightforward as I was hoping. We use
the Codahale library only in some places (IIRC just the cluster manager,
but not the actual executors). It'd make sense to create a simple wrapper
around the Codahale library to make it easier to create new metrics (a
sketch of such a wrapper also follows below).
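
To make the first point concrete, here's a rough sketch of what a
consolidated interface could look like. This is purely illustrative --
none of these names (TaskMetric, countFailedAttempts, RecordsRead) exist
in Spark today; the point is just that retry/fault-tolerance semantics
become a mode of one interface instead of a separate mechanism:

    // Purely illustrative -- not Spark's actual API. Task metrics and
    // accumulators share one interface; whether updates from failed or
    // retried task attempts are counted is just a mode.
    trait TaskMetric[IN, OUT] extends Serializable {
      def add(v: IN): Unit                         // updated on executors
      def merge(other: TaskMetric[IN, OUT]): Unit  // combined on the driver
      def value: OUT                               // the merged result
      def countFailedAttempts: Boolean             // retry semantics as a mode
    }

    // A user-defined metric written against the consolidated interface.
    class RecordsRead extends TaskMetric[Long, Long] {
      private var count = 0L
      override def add(v: Long): Unit = count += v
      override def merge(other: TaskMetric[Long, Long]): Unit =
        count += other.value
      override def value: Long = count
      override def countFailedAttempts: Boolean = false
    }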
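
And for the second point, a minimal sketch of the kind of wrapper I
mean, assuming one process-wide registry. The ProcessMetrics object and
the metric names are made up; only the MetricRegistry/Gauge calls are
real Codahale API. A REST endpoint could then just serialize the
registry's contents:

    import com.codahale.metrics.{Gauge, MetricRegistry}

    // Hypothetical wrapper -- ProcessMetrics is not existing Spark code.
    // One shared registry plus tiny helpers, so defining a new metric is
    // a one-liner and a REST handler can publish everything registered.
    object ProcessMetrics {
      val registry = new MetricRegistry

      // Create (or look up) a named counter.
      def counter(name: String) = registry.counter(name)

      // Register a gauge backed by an arbitrary by-name expression.
      def gauge[T](name: String)(f: => T): Unit =
        registry.register(name, new Gauge[T] { override def getValue: T = f })
    }

    // Example usage from scheduler or executor code:
    //   ProcessMetrics.counter("scheduler.tasksLaunched").inc()
    //   ProcessMetrics.gauge("jvm.heapUsed")(Runtime.getRuntime.totalMemory)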


On Thu, Aug 27, 2015 at 12:21 PM, Atsu Kakitani <atkakit...@groupon.com>
wrote:

> Hi,
>
> I was wondering if there are any plans to open up the API for Spark's
> metrics system. I want to write custom sources and sinks, but these
> interfaces aren't public right now. I saw that there was also an issue open
> for this (https://issues.apache.org/jira/browse/SPARK-5630), but it
> hasn't been addressed - is there a reason why these interfaces are kept
> private?
>
> Thanks,
> Atsu
>
