Using StatCounter as an example, I'd like to understand whether a "pure"
functional implementation would be more or less beneficial for the
"accumulating" structures used inside RDD.map.

StatCounter.merge updates mutable class variables and returns a reference
to the same object. This is clearly a non-functional implementation: it
mutates the existing state of the instance (unless I'm missing
something).

Would it be preferable to declare all the class variables as val and
create a new instance to hold the merged values?
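
To make the question concrete, here is a rough sketch of the two styles I have in mind. The names MutableStats and ImmutableStats are hypothetical and the logic is deliberately simplified (count, mean, and sum of squared deviations only); this is not Spark's actual StatCounter code, just an illustration of "mutate and return this" versus "all vals, return a new instance".

```scala
// Hypothetical, simplified stand-ins for illustration only;
// this is NOT Spark's StatCounter implementation.

// Mutable style: merge updates this instance's fields and returns `this`.
final class MutableStats(var n: Long = 0L, var mu: Double = 0.0, var m2: Double = 0.0) {
  def merge(other: MutableStats): MutableStats = {
    if (n == 0) {
      n = other.n; mu = other.mu; m2 = other.m2
    } else if (other.n != 0) {
      val delta = other.mu - mu
      val total = n + other.n
      mu = (mu * n + other.mu * other.n) / total
      m2 += other.m2 + delta * delta * n * other.n / total
      n = total
    }
    this // same object, state changed in place
  }
}

// Immutable style: every field is a val, merge allocates a new instance.
final case class ImmutableStats(n: Long = 0L, mu: Double = 0.0, m2: Double = 0.0) {
  def merge(other: ImmutableStats): ImmutableStats =
    if (n == 0) other
    else if (other.n == 0) this
    else {
      val delta = other.mu - mu
      val total = n + other.n
      ImmutableStats(
        total,
        (mu * n + other.mu * other.n) / total,
        m2 + other.m2 + delta * delta * n * other.n / total
      )
    }

  // Adding one value is a merge with a single-element summary.
  def add(x: Double): ImmutableStats = merge(ImmutableStats(1L, x, 0.0))

  // Population variance; NaN when no values have been seen.
  def variance: Double = if (n == 0) Double.NaN else m2 / n
}
```

With the immutable version, every add or merge allocates a fresh object, whereas the mutable version reuses one instance per accumulator. That allocation cost versus shared-state risk is the trade-off I am asking about.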

The StatCounter would be used inside RDD.map to collect stats on the
fly.
Would the mutable state present a bottleneck?

Can anybody comment on why a non-functional implementation was chosen?


  



