Unlike map(), where your function is called on one row at a time, with
mapPartitions() the function is called once per partition and is passed
the entire content of that partition as an iterator. It then returns
another iterator as the output. I don't do Scala, but from what I
understand of your code snippet, the iterator x can return all the rows in
the partition, but you call x.next only once and return after consuming
the first row. Hence you see only 1, 4, 7 in your output: these are the
first rows of each of your 3 partitions.
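
To make the per-partition behaviour concrete, here is a minimal sketch in
plain Scala that needs no Spark. The names `partitions` and
`mapPartitionsLocal` are hypothetical stand-ins I made up for the RDD and
for mapPartitions(). The partition contents are an assumption inferred from
the 1, 4, 7 output, taking each partition to hold a contiguous run of the
range.

```scala
// Plain-Scala sketch, no Spark required. `partitions` and
// `mapPartitionsLocal` are hypothetical stand-ins, not Spark APIs.
object MapPartitionsSketch {
  // Assumed layout of sc.parallelize(1 to 10, 3): three contiguous
  // partitions whose first rows are 1, 4 and 7.
  val partitions: Seq[Seq[Int]] =
    Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9, 10))

  // Call f once per partition, handing it an iterator over that
  // partition's rows, then concatenate the outputs (like collect).
  def mapPartitionsLocal(f: Iterator[Int] => Iterator[Int]): Seq[Int] =
    partitions.flatMap(p => f(p.iterator).toList)

  def main(args: Array[String]): Unit = {
    // The snippet from the question: one x.next per partition, then stop.
    val firstOnly = mapPartitionsLocal(x => List(x.next()).iterator)
    println(firstOnly.mkString(", ")) // 1, 4, 7

    // Transforming the iterator itself keeps every row.
    val doubled = mapPartitionsLocal(x => x.map(_ * 2))
    println(doubled.mkString(", ")) // 2, 4, 6, 8, 10, 12, 14, 16, 18, 20

    // Consuming the whole iterator to emit one value per partition.
    val sums = mapPartitionsLocal(x => Iterator(x.sum))
    println(sums.mkString(", ")) // 6, 15, 34
  }
}
```

In a spark-shell session the same three functions should be usable directly
as the argument to parallel.mapPartitions(...); only the first one drops
rows, because it stops reading the iterator after a single element.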

Regards
Sab
On 18-Mar-2015 10:50 pm, "ashish.usoni" <[email protected]> wrote:

> I am trying to understand mapPartitions, but I am still not sure how it
> works.
>
> In the example below, it creates three partitions:
> val parallel = sc.parallelize(1 to 10, 3)
>
> and when we run the following
> parallel.mapPartitions( x => List(x.next).iterator).collect
>
> it prints the value
> Array[Int] = Array(1, 4, 7)
>
> Can someone please explain why it prints only 1, 4, 7?
>
> Thanks,