Re: RDD with a Map

2014-06-04 Thread Amit
Yes, an RDD as a map with String keys and lists of Strings as values. Amit On Jun 4, 2014, at 2:46, Oleg Proudnikov wrote: > Just a thought... Are you trying to use the RDD as a Map? > > > > On 3 June 2014 23:14, Doris Xin wrote: > Hey Amit, > > You might want to check out PairRDDFunctions. F

Re: RDD with a Map

2014-06-04 Thread Amit
Thanks, folks. I was trying to get an RDD[multimap], so collectAsMap is what I needed. Best, Amit On Jun 4, 2014, at 6:53, Cheng Lian wrote: > On Wed, Jun 4, 2014 at 5:56 AM, Amit Kumar wrote: > > Hi Folks, > > I am new to Spark, and this is probably a basic question. > > I have a file

Re: RDD with a Map

2014-06-04 Thread Cheng Lian
On Wed, Jun 4, 2014 at 5:56 AM, Amit Kumar wrote: > Hi Folks, > > I am new to Spark, and this is probably a basic question. > > I have a file on HDFS > > 1, one > 1, uno > 2, two > 2, dos > > I want to create a multimap RDD, RDD[Map[String,List[String]]] > > {"1"->["one","uno"], "2"->["two","d

Re: RDD with a Map

2014-06-04 Thread Oleg Proudnikov
Just a thought... Are you trying to use the RDD as a Map? On 3 June 2014 23:14, Doris Xin wrote: > Hey Amit, > > You might want to check out PairRDDFunctions. > For your use case in particula

Re: RDD with a Map

2014-06-03 Thread Doris Xin
Hey Amit, You might want to check out PairRDDFunctions. For your use case in particular, you can load the file as an RDD[(String, String)] and then use the groupByKey() function in PairRDDFunctions to g
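
The approach suggested in this message, combined with the collectAsMap call mentioned later in the thread, could look roughly like the sketch below. It is only an illustration under stated assumptions: the object name MultiMapSketch, the helpers parseLine and buildMultiMap, and the local[*] master are hypothetical, and the sample lines are parallelized in place of the HDFS file because the original path is not shown in the thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; only required on Spark releases before 1.3
import org.apache.spark.rdd.RDD

// Hypothetical names throughout; a sketch, not the thread author's code.
object MultiMapSketch {
  // Split "1, one" into ("1", "one"). Assumes every line contains a comma;
  // a line without one fails with a MatchError.
  def parseLine(line: String): (String, String) = {
    val Array(key, value) = line.split(",", 2).map(_.trim)
    (key, value)
  }

  // Build key -> list of values and bring the result back to the driver.
  def buildMultiMap(sc: SparkContext,
                    lines: Seq[String]): scala.collection.Map[String, List[String]] = {
    val pairs: RDD[(String, String)] = sc.parallelize(lines).map(parseLine)
    pairs
      .groupByKey()          // RDD[(String, Iterable[String])]
      .mapValues(_.toList)   // RDD[(String, List[String])]
      .collectAsMap()        // driver-side Map; only safe when the result fits in memory
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, for experimentation only.
    val conf = new SparkConf().setAppName("multimap-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val sample = Seq("1, one", "1, uno", "2, two", "2, dos")
      buildMultiMap(sc, sample).foreach { case (key, values) =>
        println(s"$key -> ${values.mkString(", ")}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Note that groupByKey does not guarantee the order of values within a key, so the lists may come back in a different order than they appear in the input.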

Re: RDD with a Map

2014-06-03 Thread Ian O'Connell
So if your data can be kept in memory on the driver node, then you don't really need Spark? If you want to use it for reading from Hadoop, then I'd call collect immediately after you open the file, and then you can use normal Scala collections operations. On Tue, Jun 3, 2014 at 2:56 PM, Amit Kumar wrote: > Hi
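
If the data does fit on the driver, the "normal Scala collections operations" mentioned here might look like the following sketch. It uses no Spark at all; the object name DriverSideGrouping and the method groupLines are hypothetical, and the input stands in for whatever collect() would return on the RDD of lines.

```scala
// Hypothetical names; plain Scala collections only.
object DriverSideGrouping {
  // Turn lines such as "1, one" into a Map from key to the list of its values.
  // Lines without a comma are skipped rather than raising an error.
  def groupLines(lines: Seq[String]): Map[String, List[String]] =
    lines
      .map(_.split(",", 2).map(_.trim))
      .collect { case Array(key, value) => (key, value) }
      .groupBy(_._1)
      .map { case (key, pairs) => key -> pairs.map(_._2).toList }

  def main(args: Array[String]): Unit = {
    val sample = Seq("1, one", "1, uno", "2, two", "2, dos")
    groupLines(sample).foreach { case (key, values) =>
      println(s"$key -> ${values.mkString(", ")}")
    }
  }
}
```

Unlike groupByKey on an RDD, groupBy on a Seq keeps the values for each key in their original input order.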

RDD with a Map

2014-06-03 Thread Amit Kumar
Hi Folks, I am new to Spark, and this is probably a basic question. I have a file on HDFS 1, one 1, uno 2, two 2, dos I want to create a multimap RDD, RDD[Map[String,List[String]]] {"1"->["one","uno"], "2"->["two","dos"]} First I read the file val identityData:RDD[String] = sc.textFile(