It worked! I had been struggling with this for a week. Thanks a lot!
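For anyone in the archive hitting the same compile error: here is a minimal plain-Scala sketch of the fix (no Spark needed; `run` simulates what `mapPartitionsWithIndex` does with func, and this `process` is a hypothetical stand-in that just sums the parsed values). The key point is that func must map over the partition's Iterator rather than calling String methods on the Iterator itself.

```scala
object MapPartitionsSketch {
  // Hypothetical stand-in for the real process(): pairs the partition
  // index with the sum of the parsed doubles on one line.
  def process(ind: Int, values: Array[Double]): (Int, Double) =
    (ind, values.sum)

  // Simulates mapPartitionsWithIndex: each partition is an Iterator[String],
  // and func has type (Int, Iterator[String]) => Iterator[(Int, Double)].
  def run(partitions: Seq[Iterator[String]]): Seq[(Int, Double)] =
    partitions.zipWithIndex.flatMap { case (iter, ind) =>
      // Map over the iterator's elements; calling trim() on `iter` itself
      // is exactly the "value trim is not a member of Iterator[String]" error.
      iter.map(line => process(ind, line.trim().split(' ').map(_.toDouble)))
    }

  def main(args: Array[String]): Unit = {
    val res = run(Seq(Iterator("1.0 2.0"), Iterator("3.0 4.0")))
    res.foreach { case (i, s) => println(s"partition $i -> sum $s") }
  }
}
```

With a real RDD the body of `run` is what goes inside `dRDD.mapPartitionsWithIndex { (ind, iter) => ... }`, and because func now returns an Iterator of pairs, `keyval.collect()` yields an Array of tuples whose `_1` and `_2` are accessible.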

On Mon, Jul 14, 2014 at 12:31 PM, Xiangrui Meng [via Apache Spark User
List] <ml-node+s1001560n9591...@n3.nabble.com> wrote:

> You should return an iterator in mapPartitionsWithIndex. This is from
> the programming guide
> (http://spark.apache.org/docs/latest/programming-guide.html):
>
> mapPartitionsWithIndex(func): Similar to mapPartitions, but also
> provides func with an integer value representing the index of the
> partition, so func must be of type (Int, Iterator<T>) => Iterator<U>
> when running on an RDD of type T.
>
> For your case, try something similar to the following:
>
> val keyval=dRDD.mapPartitionsWithIndex { (ind,iter) =>
>   iter.map(x => process(ind,x.trim().split(' ').map(_.toDouble),q,m,r))
> }
>
> -Xiangrui
>
> On Sun, Jul 13, 2014 at 11:26 PM, Madhura <[hidden email]> wrote:
>
> > I have a text file consisting of a large number of random floating
> > values separated by spaces. I am loading this file into an RDD in Scala.
> >
> > I have heard of mapPartitionsWithIndex but I haven't been able to
> > implement it. For each partition I want to call a method (process in
> > this case) to which I want to pass the partition and its respective
> > index as parameters.
> >
> > My method returns a pair of values.
> > This is what I have done.
> >
> > val dRDD = sc.textFile("hdfs://master:54310/Data/input*")
> > var ind:Int=0
> > val keyval= dRDD.mapPartitionsWithIndex((ind,x) => process(ind,x,...))
> > val res=keyval.collect()
> >
> > We are not able to access res(0)._1 and res(0)._2
> >
> > The error log is as follows.
> >
> > [error] SimpleApp.scala:420: value trim is not a member of Iterator[String]
> > [error] Error occurred in an application involving default arguments.
> > [error]     val keyval=dRDD.mapPartitionsWithIndex( (ind,x) =>
> > process(ind,x.trim().split(' ').map(_.toDouble),q,m,r))
> > [error]
> > ^
> > [error] SimpleApp.scala:425: value mkString is not a member of Array[Nothing]
> > [error]       println(res.mkString(""))
> > [error]                   ^
> > [error] /SimpleApp.scala:427: value _1 is not a member of Nothing
> > [error]       var final= res(0)._1
> > [error]                             ^
> > [error] /home/madhura/DTWspark/src/main/scala/SimpleApp.scala:428: value _2 is not a member of Nothing
> > [error]       var final1 = res(0)._2 - m +1
> > [error]                                  ^
> >
> >
> > --
> > View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/mapPartitionsWithIndex-tp9590.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
