Given the lazy nature of RDDs, if you update an accumulator inside a map() and then call both count() and saveAsTextFile() on that RDD, the map() (and with it the accumulator update) will run twice, once per action. IMHO accumulators are a bit nondeterministic in transformations; you need to be careful about when you read them to avoid counting unexpected re-executions.
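To make the double execution concrete, here is a minimal sketch (Scala, against the Spark 1.x accumulator API current at the time of this thread; sc is assumed to be an existing SparkContext, and the input/output paths are placeholders):

    // Accumulator updated inside a transformation, read on the driver.
    val acc = sc.accumulator(0)

    val upper = sc.textFile("input.txt").map { line =>
      acc += 1              // side effect inside a lazy transformation
      line.toUpperCase
    }

    upper.count()               // 1st action: runs the map, acc.value == N
    upper.saveAsTextFile("out") // 2nd action: re-runs the map, acc.value == 2 * N

    println(acc.value)          // twice the number of input lines, not N

Calling upper.cache() before the first action avoids the recomputation (as long as the cached partitions stay in memory). Note also that the exactly-once guarantee in the documentation applies only to accumulator updates performed inside actions (e.g. foreach()), not inside transformations.

On 3/5/2015 2:09 p. m., "xiazhuchang" <hk8...@163.com> wrote: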
> The official documentation says: "In transformations, users should be aware
> that each task’s update may be applied more than once if tasks or job stages
> are re-executed."
> I don't quite understand what this means. Does it mean that if I use an
> accumulator in a transformation (e.g. a map() operation), the update will be
> executed more than once if the task is restarted? And would the final result
> then be a multiple of the real result?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Questions-about-Accumulators-tp22746.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.