Re: Running multiple foreach loops

Ted Yu Wed, 17 Feb 2016 14:13:18 -0800

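If the accumulators can all be updated in the same pass, calling foreach() once
should perform better. Each foreach() is a separate action that launches its own
job, so every extra call traverses the RDD again (and recomputes it unless it
is cached).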
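
One way to keep the concerns separate without paying for extra passes is to let
each handler build only its per-record update, and then apply all of them inside
a single foreach(). The sketch below is untested and rests on assumptions: the
Thing fields (size, valid), the handler bodies, and the names runAll and
countAll are all made up for illustration, and it uses the Spark 1.x
sc.accumulator API that was current when this thread was written. On Spark 2.0
and later, sc.longAccumulator with add() would be the equivalent.

```scala
import org.apache.spark.{Accumulator, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SinglePassAccumulators {
  // Hypothetical record type standing in for the thread's Thing.
  case class Thing(size: Long, valid: Boolean)

  // Each concern lives in its own function, but only builds the per-record
  // update; none of them calls foreach() itself. The bodies are placeholders.
  def handleA(a: Accumulator[Long]): Thing => Unit =
    thing => a += 1L                      // count records
  def handleB(b: Accumulator[Long]): Thing => Unit =
    thing => b += thing.size              // sum a field
  def handleC(c: Accumulator[Long]): Thing => Unit =
    thing => if (!thing.valid) c += 1L    // count invalid records

  // One action, hence one job and one traversal of the RDD.
  def runAll(r: RDD[Thing], handlers: Seq[Thing => Unit]): Unit =
    r.foreach(thing => handlers.foreach(h => h(thing)))

  // Wires the three accumulators together and returns their final values.
  def countAll(sc: SparkContext, things: Seq[Thing]): (Long, Long, Long) = {
    val a = sc.accumulator(0L)
    val b = sc.accumulator(0L)
    val c = sc.accumulator(0L)
    runAll(sc.parallelize(things), Seq(handleA(a), handleB(b), handleC(c)))
    (a.value, b.value, c.value)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("single-pass").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val sample = Seq(Thing(3L, true), Thing(5L, false), Thing(7L, true))
      val (count, totalSize, invalid) = countAll(sc, sample)
      println(s"count=$count totalSize=$totalSize invalid=$invalid")
    } finally {
      sc.stop()
    }
  }
}
```

If the handlers really do need their own foreach() calls, caching the RDD first
(r.cache()) should at least avoid recomputing its lineage for every job, though
each call still scans the cached data.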


> On Feb 17, 2016, at 4:30 PM, Daniel Imberman <daniel.imber...@gmail.com> 
> wrote:
> 
> Hi all,
> 
> So I'm currently figuring out how to update three separate accumulators:
> 
> val a:Accumulator
> val b:Accumulator
> val c:Accumulator
> 
> I have an r: RDD[Thing] and the code currently reads
> 
> r.foreach { thing =>
>   a += thing
>   b += thing
>   c += thing
> }
> 
> 
> Ideally, I would much prefer to split this up so that I can separate
> concerns. I'm considering creating something along the lines of:
> 
> def handleA(a:Accumulator, r:RDD[Thing]){
> //a's logic
> r.foreach{ thing => a += thing}
> }
> 
> 
> def handleB(b:Accumulator, r:RDD[Thing]){
> //b's logic
> r.foreach{ thing => b += thing}
> }
> 
> and so on. However, I'm worried that this would cause a performance hit. Does
> anyone have any thoughts as to whether this would be a bad idea?
> 
> Thank you!
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Running-multiple-foreach-loops-tp26256.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 

