Thanks, but it still doesn't seem to work. Below is my entire code.
var mp = scala.collection.mutable.Map[VertexId, Int]()

var myRdd = graph.edges.groupBy[VertexId](f).flatMap {
  edgesBySrc => func(edgesBySrc, a, b)
}

myRdd.foreach {
  node => {
    mp(node) = 1
  }
}

Values in "mp" do not get updated for any element in "myRdd".

On Tue, Feb 24, 2015 at 2:39 PM, Sean Owen <so...@cloudera.com> wrote:

> Instead of
>
> ...foreach {
>   edgesBySrc => {
>     lst ++= func(edgesBySrc)
>   }
> }
>
> try
>
> ...flatMap { edgesBySrc => func(edgesBySrc) }
>
> or even more succinctly
>
> ...flatMap(func)
>
> This returns an RDD that basically has the list you are trying to
> build, I believe.
>
> You can collect() to the driver, but beware if it is a huge data set.
>
> If you really just mean to count the results, you can count() instead.
>
> On Tue, Feb 24, 2015 at 7:35 PM, Vijayasarathy Kannan <kvi...@vt.edu> wrote:
> > I am a beginner to Scala/Spark. Could you please elaborate on how to
> > make an RDD of the results of func() and collect it?
> >
> > On Tue, Feb 24, 2015 at 2:27 PM, Sean Owen <so...@cloudera.com> wrote:
> >> They aren't the same 'lst'. One is on your driver. It gets copied to
> >> executors when the tasks are executed. Those copies are updated, but
> >> the updates are never reflected in the local copy back on the driver.
> >>
> >> You may just wish to make an RDD of the results of func() and
> >> collect() them back to the driver.
> >>
> >> On Tue, Feb 24, 2015 at 7:20 PM, kvvt <kvi...@vt.edu> wrote:
> >> > I am working on the piece of code below.
> >> >
> >> > var lst = scala.collection.mutable.MutableList[VertexId]()
> >> > graph.edges.groupBy[VertexId](f).foreach {
> >> >   edgesBySrc => {
> >> >     lst ++= func(edgesBySrc)
> >> >   }
> >> > }
> >> >
> >> > println(lst.length)
> >> >
> >> > Here, the final println() always says that the length of the list is 0.
> >> > The list is non-empty (func() itself correctly prints the length of
> >> > the list it returns).
> >> >
> >> > I am not sure if I am doing the append correctly. Can someone point
> >> > out what I am doing wrong?
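
For reference, here is a minimal, self-contained sketch of the pattern Sean
describes: flatMap the grouped edges into an RDD, collect() the results back
to the driver, and build the map locally there. The toy graph and the
stand-ins for f, func, a, and b are hypothetical, since the thread does not
show them:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object CollectExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("collect-example").setMaster("local[*]"))

    // Toy graph standing in for the thread's `graph`.
    val edgeRdd = sc.parallelize(
      Seq(Edge(1L, 2L, 1), Edge(1L, 3L, 1), Edge(2L, 3L, 1)))
    val graph = Graph.fromEdges(edgeRdd, defaultValue = 0)

    // Hypothetical stand-ins for the thread's f, func, a, and b.
    val f = (e: Edge[Int]) => e.srcId
    def func(group: (VertexId, Iterable[Edge[Int]]), a: Int, b: Int): Iterable[VertexId] =
      group._2.map(_.dstId)
    val (a, b) = (0, 0)

    val myRdd = graph.edges.groupBy[VertexId](f).flatMap(func(_, a, b))

    // Bring the results back to the driver with collect(), then build the
    // map locally. Mutating a driver-side map inside RDD.foreach only
    // mutates the copies that were serialized to the executors.
    val mp: Map[VertexId, Int] = myRdd.collect().map(v => v -> 1).toMap

    println(mp.size)  // non-zero, unlike the foreach version
    sc.stop()
  }
}

If all you actually need is the number of results, myRdd.count() avoids
pulling the data back to the driver at all, per Sean's note.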