I've opened SPARK-3051 (https://issues.apache.org/jira/browse/SPARK-3051)
based on this thread.
Neil
On Thu, Jul 24, 2014 at 10:30 PM, Neil Ferguson wrote:
> That would work well for me! Do you think it would be necessary to specify
> which accumulators should be available in the registry? [...] necessary
> changes (unless someone else wants to).
On Thu, Jul 24, 2014 at 10:17 PM, Patrick Wendell
wrote:
> What if we have a registry for accumulators, where you can access them
> statically by name?
> - Patrick
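Concretely, I imagine something like this (a rough sketch only:
AccumulatorRegistry and its methods are hypothetical, not an existing
Spark API, and a real implementation would also need to ship the
registered accumulators to the executors):

    import scala.collection.concurrent.TrieMap

    import org.apache.spark.Accumulator

    // Hypothetical registry: accumulators are registered by name on the
    // driver and looked up statically from task code.
    object AccumulatorRegistry {
      private val accumulators = TrieMap.empty[String, Accumulator[Long]]

      def register(name: String, acc: Accumulator[Long]): Unit =
        accumulators.put(name, acc)

      def get(name: String): Option[Accumulator[Long]] =
        accumulators.get(name)
    }

Task code could then record a metric without any explicit parameter,
e.g. AccumulatorRegistry.get("parse time (ms)").foreach(_ += elapsed).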
> On Thu, Jul 24, 2014 at 1:51 PM, Neil Ferguson wrote:
>> I real[...]

[...] when executing each task.
[1] https://github.com/apache/spark/pull/1498
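For context, with the change in [1] an accumulator can be given a display
name, and its value then shows up on the stage page of the web UI. A
minimal sketch (the file path is just a placeholder):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._

    val sc = new SparkContext(new SparkConf().setAppName("named-acc"))

    // The second argument is the display name shown in the web UI.
    val records = sc.accumulator(0L, "records processed")

    sc.textFile("hdfs:///data/reads.txt").foreach(_ => records += 1L)
    println(records.value)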
On Wed, Jul 23, 2014 at 8:30 AM, Neil Ferguson wrote:
> Hi Patrick.
>
> That looks very useful. The thing that seems to be missing from Shivaram's
> example is the ability to access TaskMetrics statically. [...]
>
> [...] to aggregate task metrics
> across a stage etc. So it would be great if we could also populate these in
> the UI and show median/max etc.
> I think counters [1] in Hadoop served a similar purpose.
>
> Thanks
> Shivaram
>
> [1]
> https://www.inkling.com/read/hadoop-d
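For anyone who hasn't used Hadoop counters: task code increments them
through the task context, and the framework aggregates them across tasks
and shows them in the job UI. A minimal sketch (the counter names are
made up):

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.Mapper

    class CountingMapper extends Mapper[LongWritable, Text, Text, LongWritable] {
      override def map(key: LongWritable, value: Text, context: Context): Unit = {
        // Aggregated across all map tasks by the framework.
        context.getCounter("Metrics", "records.processed").increment(1)
      }
    }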
> [...] that
> the benefits are significant enough to warrant the costs. Do I
> misunderstand that the benefit is to save one explicit parameter (the
> "context") in the signature/closure code?
>
> --
> Christopher T. Nguyen
> Co-founder & CEO, Adatao <http://adatao.com>
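To make the trade-off concrete, the two styles under discussion look
roughly like this (a sketch; AccumulatorRegistry is the hypothetical
registry sketched above):

    import org.apache.spark.Accumulator

    object Styles {
      // Explicit parameter: one extra argument in every signature and
      // closure between the job and the code that records the metric.
      def parseExplicit(line: String, parseTime: Accumulator[Long]): Array[String] = {
        val start = System.currentTimeMillis()
        val fields = line.split("\t")
        parseTime += System.currentTimeMillis() - start
        fields
      }

      // Static lookup by name: no extra parameter, but it becomes a
      // runtime requirement that the accumulator was registered before
      // the task ran.
      def parseStatic(line: String): Array[String] = {
        val start = System.currentTimeMillis()
        val fields = line.split("\t")
        AccumulatorRegistry.get("parse time (ms)")
          .foreach(_ += System.currentTimeMillis() - start)
        fields
      }
    }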
Hi all
I have been adding some metrics to the ADAM project
(https://github.com/bigdatagenomics/adam), which runs on Spark, and have a
proposal for an enhancement to Spark that would make this work cleaner and
easier.
I need to pass some Accumulators around, which will aggregate metrics
(timing stats) [...]
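To illustrate the kind of thing I mean, here is a rough sketch (the timed
helper and the paths are made up for illustration):

    import org.apache.spark.{Accumulator, SparkConf, SparkContext}
    import org.apache.spark.SparkContext._

    object TimingExample {
      // Times a block and adds the elapsed millis to the accumulator it
      // is given. Note the explicit parameter, which currently has to be
      // threaded through every layer of code that records a metric.
      def timed[T](acc: Accumulator[Long])(body: => T): T = {
        val start = System.currentTimeMillis()
        try body finally acc += System.currentTimeMillis() - start
      }

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("timing"))
        val parseTime = sc.accumulator(0L)

        sc.textFile("hdfs:///data/reads.txt")
          .map(line => timed(parseTime) { line.split("\t") })
          .count()

        println(s"total parse time: ${parseTime.value} ms")
        sc.stop()
      }
    }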