Given the lazy nature of RDDs, if you update an accumulator inside a map() and then call both count() and saveAsTextFile() on that RDD, the map() (and with it the accumulator update) will run twice, once per action. IMHO accumulators are a bit nondeterministic in transformations; you need to be careful about when you read them to avoid counting unexpected re-executions.
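To make the double execution concrete, here is a minimal sketch (Scala, against the Spark 1.x accumulator API current at the time of this thread; sc is assumed to be an existing SparkContext, and the input/output paths are placeholders):

    // Accumulator updated inside a transformation, read on the driver.
    val acc = sc.accumulator(0)

    val upper = sc.textFile("input.txt").map { line =>
      acc += 1              // side effect inside a lazy transformation
      line.toUpperCase
    }

    upper.count()               // 1st action: runs the map, acc.value == N
    upper.saveAsTextFile("out") // 2nd action: re-runs the map, acc.value == 2 * N

    println(acc.value)          // twice the number of input lines, not N

Calling upper.cache() before the first action avoids the recomputation (as long as the cached partitions stay in memory). Note also that the exactly-once guarantee in the documentation applies only to accumulator updates performed inside actions (e.g. foreach()), not inside transformations.

On 3/5/2015 2:09 p. m., "xiazhuchang" <hk8...@163.com> wrote: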
> The official documentation says: "In transformations, users should be aware
> that each task’s update may be applied more than once if tasks or job stages
> are re-executed."
> I don't quite understand what this means. Does it mean that if I use an
> accumulator in a transformation (e.g. a map() operation), the update will be
> executed more than once if the task is restarted? And would the final result
> then be a multiple of the real result?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Questions-about-Accumulators-tp22746.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.