foreachPartition is an action, but the function you pass to it runs on the workers, so anything it prints ends up in the executor output and you won't see it on the driver.
mapPartitions is a transformation, which is lazy and won't do anything until an action is called. Which one is better depends on the specific use case. To output something on the driver (like a print on a single machine), you could look at take, collect, foreach, etc. Note that foreach also runs on the workers, so to print on the driver bring the data back with take or collect first.

On Mon, Mar 20, 2017 at 2:20 PM, Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com> wrote:
> Just wanted to clarify!!!
>
> Is foreachPartition in spark an output operation?
>
> Which one is better use mapPartitions or foreachPartitions?
>
> Regards
> Diwakar