I think one of the primary cases where mapPartitions is useful is when you have setup work that can be reused across elements. That way the setup only needs to be done once per partition rather than once per element (for example, creating a Joda-Time instance).
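To make the setup-sharing point concrete, here is a minimal sketch in plain Python that runs without Spark. Everything in it is hypothetical and only for illustration: `make_parser` stands in for the expensive setup (a standard-library date parser rather than Joda-Time), nested lists stand in for partitions, and a global counter tracks how often setup runs.

```python
from datetime import datetime

setup_calls = 0  # counts how often the stand-in "expensive" setup runs


def make_parser():
    """Hypothetical stand-in for costly setup, e.g. building a date formatter."""
    global setup_calls
    setup_calls += 1
    return lambda s: datetime.strptime(s, "%Y-%m-%d")


def parse_one(s):
    # map-style function: the setup is repeated for every element.
    return make_parser()(s)


def parse_partition(records):
    # mapPartitions-style function: takes an iterator over one partition,
    # does the setup once, then reuses it for every element.
    parse = make_parser()
    for s in records:
        yield parse(s)


# Two fake partitions standing in for an RDD's partitions.
partitions = [["2015-06-23", "2015-06-24"], ["2015-06-25"]]

per_element = [parse_one(s) for part in partitions for s in part]
calls_map = setup_calls  # one setup per element

setup_calls = 0
per_partition = [d for part in partitions for d in parse_partition(iter(part))]
calls_map_partitions = setup_calls  # one setup per partition

print(calls_map, calls_map_partitions)
```

With PySpark the same two functions could be passed as `rdd.map(parse_one)` and `rdd.mapPartitions(parse_partition)`. The global counter is only meaningful in this local demo, since on a cluster each executor would have its own copy.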
Both map and mapPartitions are implemented using MapPartitionsRDD. In general, if your logic is easily expressed with map and there isn't any setup work that could be shared, using map instead of mapPartitions tends to result in more readable code, which is valuable in and of itself.

On Tue, Jun 23, 2015 at 4:57 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> I know when to use a map () but when should i use mapPartitions() ?
>
> Which is faster ?
>
> --
> Deepak
>
>

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
Linked In: https://www.linkedin.com/in/holdenkarau