Re: "Dynamic variables" in Spark

2014-08-15 Thread Neil Ferguson
I've opened SPARK-3051 (https://issues.apache.org/jira/browse/SPARK-3051) based on this thread.

Neil

On Thu, Jul 24, 2014 at 10:30 PM, Neil Ferguson wrote:
> That would work well for me! Do you think it would be necessary to specify
> which accumulators should be available in the r[egistry] [...]
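
A minimal sketch of what the SPARK-3051 registry might look like. The
AccumulatorRegistry object and all names below are hypothetical, not Spark
API, and the Spark calls assume the 1.x API of the period:

    import org.apache.spark.{Accumulator, SparkConf, SparkContext}
    import org.apache.spark.SparkContext._ // Long accumulator implicits (Spark 1.x)
    import scala.collection.concurrent.TrieMap

    // Hypothetical registry, sketched for this thread; not a real Spark API.
    object AccumulatorRegistry {
      private val accums = TrieMap.empty[String, Accumulator[Long]]
      def register(name: String, acc: Accumulator[Long]): Unit = accums.put(name, acc)
      def get(name: String): Option[Accumulator[Long]] = accums.get(name)
    }

    object RegistrySketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("registry-sketch"))
        AccumulatorRegistry.register("records.seen", sc.accumulator(0L))

        // Task code fetches the accumulator by name instead of capturing it
        // in the closure. This works here only because local mode runs tasks
        // in the driver JVM; getting registrations to executors is the open
        // design question in SPARK-3051.
        sc.parallelize(1 to 100).foreach { _ =>
          AccumulatorRegistry.get("records.seen").foreach(_ += 1L)
        }

        println(AccumulatorRegistry.get("records.seen").map(_.value)) // Some(100)
        sc.stop()
      }
    }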

Re: "Dynamic variables" in Spark

2014-07-24 Thread Neil Ferguson
[... nece]ssary changes (unless someone else wants to).

On Thu, Jul 24, 2014 at 10:17 PM, Patrick Wendell wrote:
> What if we have a registry for accumulators, where you can access them
> statically by name?
>
> - Patrick
>
> On Thu, Jul 24, 2014 at 1:51 PM, Neil Ferguson wrote:
>> I real [...]
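
The closest existing hook at the time: Spark 1.x accumulators can carry a
display name (shown in the web UI), but task code cannot look one up by that
name; it still has to capture the accumulator in its closure. That lookup is
exactly what the registry would add. A small sketch, assuming the named
accumulator overload from the Spark 1.x programming guide:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._ // Long accumulator implicits (Spark 1.x)

    object NamedAccumulatorSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("named-acc-sketch"))

        // The name is display-only: it labels the accumulator in the UI but
        // provides no way to fetch it from task code, which is the gap
        // Patrick's registry proposal fills.
        val skipped = sc.accumulator(0L, "records.skipped")

        sc.parallelize(1 to 10).foreach { i => if (i % 2 == 0) skipped += 1L }
        println(s"skipped = ${skipped.value}") // 5
        sc.stop()
      }
    }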

Re: "Dynamic variables" in Spark

2014-07-24 Thread Neil Ferguson
[...] when executing each task.

[1] https://github.com/apache/spark/pull/1498

On Wed, Jul 23, 2014 at 8:30 AM, Neil Ferguson wrote:
> Hi Patrick.
>
> That looks very useful. The thing that seems to be missing from Shivaram's
> example is the ability to access TaskMetrics st[atically] [...]
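
For task state generally, the static access Neil describes did later appear
in Spark as TaskContext.get(), a thread-local lookup added after this thread.
A sketch of the pattern; whether TaskMetrics should be reachable the same way
was the open question here:

    import org.apache.spark.{SparkConf, SparkContext, TaskContext}

    object TaskContextSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("taskcontext-sketch"))

        sc.parallelize(1 to 4, numSlices = 2).foreachPartition { iter =>
          // No context parameter is threaded through user code: the task
          // asks for its own context statically, the same shape of access
          // the thread wants for per-task metrics.
          val ctx = TaskContext.get()
          println(s"stage ${ctx.stageId()}, partition ${ctx.partitionId()}: " +
            s"${iter.size} records")
        }
        sc.stop()
      }
    }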

Re: "Dynamic variables" in Spark

2014-07-23 Thread Neil Ferguson
> [...] to aggregate task metrics across a stage etc. So it would be great if
> we could also populate these in the UI and show median/max etc.
> I think counters [1] in Hadoop served a similar purpose.
>
> Thanks
> Shivaram
>
> [1] https://www.inkling.com/read/hadoop-d [...]
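
Median/max summaries need per-task samples, not just a running total. One way
to gather them with the Spark 1.x API of the period is accumulableCollection
(since deprecated); a sketch with illustrative names, not what the UI work
actually did:

    import org.apache.spark.{SparkConf, SparkContext}
    import scala.collection.mutable

    object StageTimingSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("stage-timing-sketch"))

        // One elapsed-time sample per partition, so the driver can compute
        // distribution statistics; a plain Long accumulator would collapse
        // the per-task breakdown that median/max require.
        val taskNanos = sc.accumulableCollection(mutable.ArrayBuffer.empty[Long])

        sc.parallelize(1 to 1000, numSlices = 8).foreachPartition { iter =>
          val start = System.nanoTime()
          iter.foreach(_ => ()) // stand-in for real work
          taskNanos += System.nanoTime() - start
        }

        val sorted = taskNanos.value.sorted
        println(s"tasks=${sorted.size} median=${sorted(sorted.size / 2)}ns " +
          s"max=${sorted.last}ns")
        sc.stop()
      }
    }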

Re: "Dynamic variables" in Spark

2014-07-22 Thread Neil Ferguson
> [...] that the benefits are significant enough to warrant the costs. Do I
> misunderstand that the benefit is to save one explicit parameter (the
> "context") in the signature/closure code?
>
> --
> Christopher T. Nguyen
> Co-founder & CEO, Adatao <http://adatao.com>

[...]
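
Christopher's question in miniature: the saving is one ambient value in place
of a parameter threaded through every signature. A self-contained sketch
using scala.util.DynamicVariable, the mechanism the thread title alludes to;
all names are illustrative:

    import scala.util.DynamicVariable

    object ParameterThreadingSketch {
      final class Metrics { var records = 0L }

      // Explicit style: every function that records a metric grows an extra
      // parameter, and so does everything that calls it.
      def parseExplicit(line: String, m: Metrics): String = {
        m.records += 1; line.trim
      }

      // Dynamic style: the metrics object is ambient within a scope.
      val current = new DynamicVariable[Metrics](new Metrics)
      def parseDynamic(line: String): String = {
        current.value.records += 1; line.trim
      }

      def main(args: Array[String]): Unit = {
        val m1 = new Metrics
        Seq(" a ", " b ").foreach(parseExplicit(_, m1))

        val m2 = new Metrics
        current.withValue(m2) {
          Seq(" a ", " b ").foreach(parseDynamic)
        }
        println(s"explicit=${m1.records} dynamic=${m2.records}") // 2 and 2
      }
    }

The catch debated in the thread: DynamicVariable scoping is thread-local, so
it does not survive serialization into Spark task closures; making something
like it work across a cluster is the actual feature request.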

Re: "Dynamic variables" in Spark

2014-07-22 Thread Neil Ferguson
> > [... tha]t the benefits are significant enough to warrant the costs. Do I
> > misunderstand that the benefit is to save one explicit parameter (the
> > "context") in the signature/closure code?
> >
> > --
> > Christopher T. Nguyen
> > Co-founder & CEO, Ad[atao] [...]

"Dynamic variables" in Spark

2014-07-21 Thread Neil Ferguson
Hi all,

I have been adding some metrics to the ADAM project
(https://github.com/bigdatagenomics/adam), which runs on Spark, and have a
proposal for an enhancement to Spark that would make this work cleaner and
easier. I need to pass some Accumulators around, which will aggregate
metrics (timing stat[istics] [...]
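
A sketch of the status quo the proposal wants to clean up, with hypothetical
ADAM-flavored names (this is not ADAM's actual code): the timing accumulator
has to be captured and passed explicitly through every helper used inside an
RDD operation.

    import org.apache.spark.{Accumulator, SparkConf, SparkContext}
    import org.apache.spark.SparkContext._ // Long accumulator implicits (Spark 1.x)

    object ExplicitThreadingSketch {
      // A helper buried several calls deep still needs the accumulator
      // handed down through its signature.
      def parseRead(line: String, timingNanos: Accumulator[Long]): String = {
        val start = System.nanoTime()
        val parsed = line.trim // stand-in for real parsing work
        timingNanos += System.nanoTime() - start
        parsed
      }

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("adam-metrics-sketch"))
        val parseTime = sc.accumulator(0L)

        // The accumulator rides along explicitly in every closure and
        // signature; a dynamic variable (or named registry) would make it
        // ambient instead.
        sc.parallelize(Seq(" read1 ", " read2 "))
          .map(parseRead(_, parseTime))
          .count()

        println(s"total parse time: ${parseTime.value} ns")
        sc.stop()
      }
    }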