Cody said "If you don't care about the value that your map produced (because you're not already collecting or saving it), then is foreach more appropriate to what you're doing?" but I cannot see that message in this thread. Anyway, I ran a small benchmark to find out which function is the most efficient way to trigger the map. The winner is foreach(a => a), as everyone expected. Collect can cause an OOM on the driver, and count is much slower than the others. Thanks, all.
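For anyone finding this later, here is a minimal sketch of the foreach approach. It is not the benchmark I ran. It assumes Spark 2.x or later (SparkSession and longAccumulator), and the names ForceMap, forceWithForeach and mapCalls are made up for illustration. The accumulator is only there to show that the map function really ran.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, for illustration only.
object ForceMap {
  // Builds a lazy map over 1..n, forces it with a no-op foreach,
  // and returns how many times the map function was invoked.
  def forceWithForeach(sc: SparkContext, n: Int): Long = {
    val calls = sc.longAccumulator("mapCalls")

    // map is lazy: nothing runs until an action is called on the RDD.
    val mapped = sc.parallelize(1 to n).map { x =>
      calls.add(1)
      x * 2
    }

    // foreach is an action: it runs the map on the executors and
    // sends no elements back to the driver.
    mapped.foreach(_ => ())

    calls.sum
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a quick test run.
    val spark = SparkSession.builder()
      .appName("force-map-sketch")
      .master("local[*]")
      .getOrCreate()

    println(forceWithForeach(spark.sparkContext, 1000))
    spark.stop()
  }
}
```

Swapping the foreach line for mapped.count() or mapped.collect() also forces the map, but collect copies every element to the driver, which is where the OOM risk comes from.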
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Manually-trigger-RDD-map-function-without-action-tp21094p21110.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.