“For accumulator updates performed inside actions only, Spark guarantees that each task’s update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware of that each task’s update may be applied more than once if tasks or job stages are re-executed.”

Does this mean that the guarantee (each update to the accumulator is applied only once) holds only in actions? In other words, should accumulators be used only in actions, because updates made inside transformations may be applied more than once and give a wrong result?

For example, take map(x => accumulator += x). After it runs, the correct value of the accumulator should be "1". If an error then occurs and the task is restarted, the map() operation (and with it accumulator += x) is re-executed. Would the final value of the accumulator then be "2", twice the correct result?
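The scenario can be sketched in a small local program. This is only an illustration under stated assumptions: it uses the `longAccumulator` API from Spark 2.0 and later rather than the older `accumulator += x` syntax quoted above, the names `AccumulatorDemo` and `demo` are made up for this example, and it triggers re-execution by running two actions on an uncached RDD instead of relying on a real task failure, which is hard to reproduce on demand.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; the names are illustrative, not from the original post.
object AccumulatorDemo {

  // Returns (value updated inside an action, value updated inside a transformation).
  def demo(sc: SparkContext): (Long, Long) = {
    val data = sc.parallelize(Seq(1))

    // Update inside an action: the quoted documentation says each task's
    // update is applied only once, even if the task is restarted.
    val inAction = sc.longAccumulator("inAction")
    data.foreach(x => inAction.add(x))

    // Update inside a transformation: map() is lazy and is re-run every time
    // the RDD is recomputed. Two actions on the uncached RDD stand in for a
    // re-executed task or stage.
    val inTransformation = sc.longAccumulator("inTransformation")
    val mapped = data.map { x => inTransformation.add(x); x }
    mapped.count()
    mapped.count()

    (inAction.sum, inTransformation.sum)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two worker threads.
    val conf = new SparkConf().setAppName("accumulator-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (actionValue, transformationValue) = demo(sc)
      println(s"updated in action:         $actionValue")
      println(s"updated in transformation: $transformationValue")
    } finally {
      sc.stop()
    }
  }
}
```

In this sketch the accumulator updated in foreach() should end at 1, while the one updated in map() should end at 2 because the map() function ran once per action. Caching `mapped` before the first action would avoid the recomputation in this particular case, but it would not turn a transformation-side update into a guaranteed exactly-once update.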
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Questions-about-Accumulators-tp22746p22747.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.