If the accumulators are all updated from the same data, calling foreach() once should give better performance: each foreach() is a separate action, so splitting the updates into three calls traverses (and, unless r is cached, recomputes) the RDD three times instead of once.
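For example, something along these lines keeps each accumulator's logic in its own function but still makes only one pass over the RDD. This is just a rough sketch against the Spark 1.x accumulator API; the Int stand-in for Thing and the handleA/handleB/handleC helpers are illustrative, not from your code:

    import org.apache.spark.{SparkConf, SparkContext}

    object SingleForeach {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("single-foreach").setMaster("local[*]"))

        // Stand-in for r: RDD[Thing]; here Thing is just Int.
        val r = sc.parallelize(1 to 100)

        val a = sc.accumulator(0, "a")
        val b = sc.accumulator(0, "b")
        val c = sc.accumulator(0, "c")

        // Each accumulator's logic lives in its own function...
        def handleA(thing: Int): Unit = a += thing
        def handleB(thing: Int): Unit = b += thing
        def handleC(thing: Int): Unit = c += thing

        // ...but the RDD is traversed only once.
        r.foreach { thing =>
          handleA(thing)
          handleB(thing)
          handleC(thing)
        }

        println(s"a=${a.value} b=${b.value} c=${c.value}")
        sc.stop()
      }
    }

If you do keep separate functions that each call foreach, calling r.cache() first would at least avoid recomputing r on every pass, at the cost of memory; you would still pay for three traversals.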
> On Feb 17, 2016, at 4:30 PM, Daniel Imberman <daniel.imber...@gmail.com> wrote:
>
> Hi all,
>
> So I'm currently figuring out how to accumulate three separate accumulators:
>
>     val a: Accumulator
>     val b: Accumulator
>     val c: Accumulator
>
> I have an r: RDD[Thing] and the code currently reads
>
>     r.foreach { thing =>
>       a += thing
>       b += thing
>       c += thing
>     }
>
> Ideally, I would much prefer to split this up so that I can separate
> concerns. I'm considering creating something along the lines of:
>
>     def handleA(a: Accumulator, r: RDD[Thing]) {
>       // a's logic
>       r.foreach { thing => a += thing }
>     }
>
>     def handleB(b: Accumulator, r: RDD[Thing]) {
>       // b's logic
>       r.foreach { thing => b += thing }
>     }
>
> and so on. However, I'm worried that this would cause a performance hit. Does
> anyone have any thoughts as to whether this would be a bad idea?
>
> thank you!
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Running-multiple-foreach-loops-tp26256.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.