I've opened SPARK-3051 (https://issues.apache.org/jira/browse/SPARK-3051)
based on this thread.
Neil
On Thu, Jul 24, 2014 at 10:30 PM, Neil Ferguson wrote:
That would work well for me! Do you think it would be necessary to specify
which accumulators should be available in the registry, or would we just
broadcast all named accumulators registered in SparkContext and make them
available in the registry?
Anyway, I'm happy to make the necessary changes ...
What if we have a registry for accumulators, where you can access them
statically by name?
- Patrick
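A minimal sketch of what such a name-keyed registry could look like. Note that `AccumulatorRegistry` and the `Counter` stand-in are hypothetical names for illustration, not Spark API; a real version would hold Spark's `Accumulator` type and be populated by `SparkContext` when a named accumulator is created:

```scala
import java.util.concurrent.ConcurrentHashMap

// Stand-in for a named accumulator; a real implementation would wrap
// Spark's Accumulator rather than a bare Long.
class Counter(val name: String) {
  private var value = 0L
  def add(n: Long): Unit = synchronized { value += n }
  def get: Long = synchronized { value }
}

// Hypothetical global registry: accumulators are registered by name on
// the driver and looked up statically from task code.
object AccumulatorRegistry {
  private val registry = new ConcurrentHashMap[String, Counter]()

  def register(c: Counter): Unit = registry.put(c.name, c)

  // Option(null) is None, so an unknown name yields None.
  def get(name: String): Option[Counter] = Option(registry.get(name))
}
```

Task code would then do something like `AccumulatorRegistry.get("f1-time").foreach(_.add(elapsed))` without needing a reference threaded through closures.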
On Thu, Jul 24, 2014 at 1:51 PM, Neil Ferguson wrote:
I realised that my last reply wasn't very clear -- let me try and clarify.
The patch for named accumulators looks very useful, however in Shivaram's
example he was able to retrieve the named task metrics (statically) from a
TaskMetrics object, as follows:
TaskMetrics.get("f1-time")
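One way a static `get` like that could work is a thread-local map that the executor installs before running user code. This is a hedged sketch of the pattern only; `setForTask` and this `TaskMetrics` object are assumptions, not the actual Spark class:

```scala
// Hypothetical: a per-task metrics map, installed by the executor thread
// before user code runs, so task code can look up metrics statically.
object TaskMetrics {
  private val current = new ThreadLocal[Map[String, Long]] {
    override def initialValue(): Map[String, Long] = Map.empty
  }

  // Would be called by the framework at task start (assumption).
  def setForTask(metrics: Map[String, Long]): Unit = current.set(metrics)

  def get(name: String): Option[Long] = current.get.get(name)
}
```

Because the map is thread-local, concurrent tasks on the same executor would each see only their own metrics.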
However, I do ...
Hi Patrick.
That looks very useful. The thing that seems to be missing from Shivaram's
example is the ability to access TaskMetrics statically (this is the same
problem that I am trying to solve with dynamic variables).
You mention defining an accumulator on the RDD. Perhaps I am missing ...
Shivaram,
You should take a look at this patch which adds support for naming
accumulators - this is likely to get merged in soon. I actually
started this patch by supporting named TaskMetrics similar to what you
have there, but then I realized there is too much semantic overlap
with accumulators, ...
Hi Christopher
Thanks for your reply. I'll try and address your points -- please let me
know if I missed anything.
Regarding clarifying the problem statement, let me try and do that with a
real-world example. I have a method that I want to measure the performance
of, which has the following signature ...
From reading Neil's first e-mail, I think the motivation is to get some
metrics in ADAM? -- I've run into a similar use-case with having
user-defined metrics in long-running tasks and I think a nice way to solve
this would be to have user-defined TaskMetrics.
To state my problem more clearly, ...
Hi Reynold
Thanks for your reply.
Accumulators are, of course, stored in the Accumulators object as
thread-local variables. However, the Accumulators object isn't public, so
when a Task is executing there's no way to get the set of accumulators for
the current thread -- accumulators still have to ...
Thanks for the thoughtful email, Neil and Christopher.
If I understand this correctly, it seems like the dynamic variable is just
a variant of the accumulator (a static one since it is a global object).
Accumulators are already implemented using thread-local variables under the
hood. Am I misunderstanding ...
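For reference, the Scala standard library already ships a thread-local dynamic variable, `scala.util.DynamicVariable`, which is roughly the mechanism under discussion: `withValue` rebinds the variable for the current thread only, for the duration of the block. A small self-contained example (the names here are illustrative, not Spark code):

```scala
import scala.util.DynamicVariable

// A dynamically scoped variable with a default binding of "none".
val taskName = new DynamicVariable[String]("none")

// Any code running on this thread can read the current binding.
def report(): String = s"running: ${taskName.value}"

// Inside the block the rebinding is visible; outside it reverts.
val inside = taskName.withValue("f1-stage") { report() }
val outside = report()
```

This is also why the dynamic-variable proposal overlaps so much with accumulators: both come down to thread-local state that task-side code reads without an explicit reference.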
Hi Neil, first off, I'm generally a sympathetic advocate for making changes
to Spark internals to make it easier/better/faster/more awesome.
In this case, I'm (a) not clear about what you're trying to accomplish, and
(b) a bit worried about the proposed solution.
On (a): it is stated that you want ...