On May 9, 2016 8:24:06 PM EDT, Abi wrote:
>I am splitting an integer array in 2 partitions and using an
>accumulator to sum the array. The problem is:
>
>1. I am not seeing the execution time drop to half that of a linear sum.
>
>2. The second node (from looking at timestamps) takes 3 times as long
>as
Your mail does not describe much, but won't a simple reduce function help
you?
Something like the below:
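val data = Seq(1, 2, 3, 4, 5, 6, 7)
// sc is an existing SparkContext; the second argument asks for 2 partitions
val rdd = sc.parallelize(data, 2)
// each partition is summed separately, then the partial sums are combined
val sum = rdd.reduce((a, b) => a + b)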
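To compare the two approaches from this thread side by side, here is a minimal, self-contained sketch. It assumes Spark 2.x or later (for `sc.longAccumulator`) and uses a `local[2]` master as a stand-in for a two-node cluster. The names `SumSketch` and `sums` are hypothetical, not part of the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SumSketch {
  // Sums `data` twice over 2 partitions: once with reduce, once with an accumulator.
  def sums(sc: SparkContext, data: Seq[Int]): (Int, Long) = {
    val rdd = sc.parallelize(data, 2)

    // reduce: partial sums are computed per partition, then merged
    val viaReduce = rdd.reduce(_ + _)

    // accumulator: each task adds its elements; the driver reads the total
    val acc = sc.longAccumulator("sum")
    rdd.foreach(x => acc.add(x))

    (viaReduce, acc.sum)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with 2 threads, not a real 2-node cluster
    val conf = new SparkConf().setAppName("SumSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (viaReduce, viaAccumulator) = sums(sc, Seq(1, 2, 3, 4, 5, 6, 7))
      println(s"reduce=$viaReduce accumulator=$viaAccumulator")
    } finally {
      sc.stop()
    }
  }
}
```

Both values should agree. Wrapping each call in your own timing code would show whether the accumulator path is really the slower one in your setup.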
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.co
I am splitting an integer array in 2 partitions and using an accumulator to
sum the array. The problem is:
1. I am not seeing the execution time drop to half that of a linear sum.
2. The second node (from looking at timestamps) takes 3 times as long as
the first node. This gives the impression it is "wait
I am splitting an integer array in 2 partitions and using an accumulator to
sum the array. The problem is:
1. I am not seeing the execution time drop to half that of a linear sum.
2. The second node (from looking at timestamps) takes 3 times as long as the
first node. This gives the impression it is "
I'm not sure if it's an exact match, or just very close :-)
I don't think our problem is the workload on the driver, I think it's just
memory - so while the solution proposed there would work, it would also be
sufficient for our purposes, I believe, simply to clear each block as soon
as it's added
Hi Nathan,
It sounds like what you're asking for has already been filed as
https://issues.apache.org/jira/browse/SPARK-664
Does that ticket match what you're proposing?
Andrew
On Fri, Nov 21, 2014 at 12:29 PM, Nathan Kronenfeld <
nkronenf...@oculusinfo.com> wrote:
> We've done this with reduce
We've done this with reduce - that definitely works.
I've reworked the logic to use accumulators because, when it works, it's
5-10x faster
On Fri, Nov 21, 2014 at 4:44 AM, Sean Owen wrote:
> This sounds more like a use case for reduce, or fold? It sounds like
> you're kind of cobbling together
This sounds more like a use case for reduce, or fold? It sounds like
you're kind of cobbling together the same function on accumulators,
when reduce/fold are simpler and have the behavior you suggest.
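As an illustration of the reduce/fold route for an array-shaped result, here is a hedged sketch using `RDD.aggregate`, a close relative of `fold` that lets the result type differ from the element type. The bin count, the input data, and the names `BinnedCounts` and `mergeBins` are all hypothetical; the thread does not say what the array actually holds.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BinnedCounts {
  // Element-wise merge of two partial results; mutates and returns `a`.
  def mergeBins(a: Array[Long], b: Array[Long]): Array[Long] = {
    var i = 0
    while (i < a.length) { a(i) += b(i); i += 1 }
    a
  }

  // Counts how many values fall into each of `numBins` bins (value mod numBins).
  def countBins(sc: SparkContext, values: Seq[Int], numBins: Int): Array[Long] =
    sc.parallelize(values, 4).aggregate(new Array[Long](numBins))(
      // seqOp: fold one element into the partition-local array
      (bins, v) => { bins(v % numBins) += 1; bins },
      // combOp: merge the per-partition arrays
      mergeBins
    )

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, purely for illustration
    val conf = new SparkConf().setAppName("BinnedCounts").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      println(countBins(sc, 0 until 100, 8).mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

Each partition gets its own copy of the zero array, so only one array per partition is sent back to be merged. Whether this beats the accumulator version for a large array would need measuring on the real workload.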
On Fri, Nov 21, 2014 at 5:46 AM, Nathan Kronenfeld
wrote:
> I think I understand what is going o
I think I understand what is going on here, but I was hoping someone could
confirm (or explain reality if I don't) what I'm seeing.
We are collecting data using a rather sizable accumulator - essentially, an
array of tens of thousands of entries. All told, about 1.3m of data.
If I understand thi
I notice that accumulators register themselves with a private Accumulators
object.
I don't notice any way to unregister them when one is done.
Am I missing something? If not, is there any plan for how to free up that
memory?
I've a case where we're gathering data from repeated queries using some