cool let me adapt that. thanks a ton
regards
sanjay
From: Sean Owen
To: Sanjay Subramanian
Cc: "user@spark.apache.org"
Sent: Monday, January 5, 2015 3:19 AM
Subject: Re: FlatMapValues
For the record, the solution I was suggesting was about like this:
inputRDD.flatMap { input =>
  val tokens = input.split(',')
  val id = tokens(0)
  val keyValuePairs = tokens.tail.grouped(2)
  val keys = keyValuePairs.map(_(0))
  keys.map(key => (id, key))
}
This is much more efficient.
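To make the suggestion concrete, the body of that flatMap can be tried on a plain String without Spark. A minimal sketch, assuming the record layout from the sample data later in this thread (an id followed by alternating reaction/version fields); `parseLine` is just an illustrative name:

```scala
// Sketch of the per-line logic above, runnable without Spark.
// Assumes records look like "id,reac1,ver1,reac2,ver2,..." as in the
// sample data later in the thread; parseLine is an illustrative name.
object FlatMapSketch {
  def parseLine(input: String): Seq[(String, String)] = {
    val tokens = input.split(',')
    val id = tokens(0)                          // first field is the record id
    val keyValuePairs = tokens.tail.grouped(2)  // (reaction, version) pairs
    val keys = keyValuePairs.map(_(0))          // drop the version numbers
    keys.map(key => (id, key)).toSeq
  }

  def main(args: Array[String]): Unit =
    parseLine("025126,Chills,8.10,Malaise,8.10").foreach(println)
    // prints (025126,Chills) and (025126,Malaise)
}
```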
else {
("")
}
}).flatMap(str => str.split('\t')).filter(line =>
line.toString.length() > 0).saveAsTextFile("/data/vaers/msfx/reac/" + outFile)
From: Sanjay Subramanian
To: Hitesh Khamesra
Cc:
thanks let me try that out
From: Hitesh Khamesra
To: Sanjay Subramanian
Cc: Kapil Malik ; Sean Owen ;
"user@spark.apache.org"
Sent: Thursday, January 1, 2015 9:46 AM
Subject: Re: FlatMapValues
How about this: apply flatMap per line, and in that function, parse each line.
> and you need
> to import org.apache.spark.rdd.SparkContext._ to use them
> (http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions
> )
>
> @Sean, yes indeed flatMap / flatMapValues both can be used.
>
> Regards,
>
> Kapil
>
>
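Independent of the import mechanics discussed above, flatMapValues on an RDD[(K, V)] keeps each key and expands its value into zero or more new values. Its behavior can be sketched on a plain Scala collection; `flatMapValuesLike` is a hypothetical stand-in for illustration, not Spark's API:

```scala
// What flatMapValues does, sketched on an ordinary List of pairs.
// flatMapValuesLike is a stand-in for illustration, not Spark's API.
object FlatMapValuesSketch {
  def flatMapValuesLike[K, V, W](pairs: Seq[(K, V)])(f: V => Seq[W]): Seq[(K, W)] =
    pairs.flatMap { case (k, v) => f(v).map(w => (k, w)) }

  def main(args: Array[String]): Unit = {
    val pairs = Seq(("025126", "Chills\tMalaise"))
    flatMapValuesLike(pairs)(v => v.split('\t').toSeq).foreach(println)
    // prints (025126,Chills) and (025126,Malaise)
  }
}
```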
thanks
sanjay
From: Kapil Malik
To: Sean Owen ; Sanjay Subramanian
Cc: "user@spark.apache.org"
Sent: Wednesday, December 31, 2014 9:35 AM
Subject: RE: FlatMapValues
Hi Sanjay,
Oh yes .. on flatMapValues, it's defined in PairRDDFunctions, and you need to
import org.apache.spark.rdd.SparkContext._ to use them
(http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions).
@Sean, yes indeed flatMap / flatMapValues both can be used.
Regards,
Kapil
-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com]
Sent: 31 December 2014 21:16
To: Sanjay Subramanian
Cc: user@spark.apache.org
Subject: Re: FlatMapValues
From the clarification below, the problem is that you are calling
flatMapValues, which is only available on an RDD of key-value tuples.
Your map function returns a tuple in one case but a String in the
other, so your RDD is a bunch of Any, which is not at all what you
want. You need to return a tuple in both cases.
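Sean's diagnosis can be reproduced on a plain collection: when one branch of the map returns a tuple and the other a String, the compiler infers a common supertype for the element type, so pair-only operations like flatMapValues are no longer available. A small sketch:

```scala
// Mixed return types from map: one branch yields a tuple, the other a
// String, so the element type is inferred as a common supertype (in
// effect Any, as described above) and pair-only methods disappear.
object MixedReturnSketch {
  val mixed = List("025126,Chills", "junk").map { line =>
    if (line.contains(',')) {
      val f = line.split(',')
      (f(0), f(1))   // (String, String) for well-formed lines
    } else {
      ""             // String for everything else
    }
  }
  // mixed.flatMapValues(...) would not compile: mixed is not a
  // collection of pairs as far as the type system is concerned.
}
```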
thanks
regards
sanjay
From: Fernando O.
To: Kapil Malik
Cc: Sanjay Subramanian ; "user@spark.apache.org"
Sent: Wednesday, December 31, 2014 6:06 AM
Subject: Re: FlatMapValues
Hi Sanjay,
Doing an if inside a Map sounds like a bad idea, it seems like you actually
want to filter and then apply map
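Fernando's point can be sketched on a plain Scala collection (an RDD pipeline reads the same way); the field layout is assumed from the sample data in this thread, and `clean` is an illustrative name:

```scala
// Filter first, then map: drop unparseable lines before transforming,
// instead of returning "" from an if inside map and cleaning up later.
object FilterThenMapSketch {
  def clean(lines: Seq[String]): Seq[(String, String)] =
    lines
      .filter(_.split(',').length >= 4)   // keep only lines with enough fields
      .map { line =>
        val fields = line.split(',')
        (fields(0), fields(1) + "\t" + fields(3))
      }

  def main(args: Array[String]): Unit =
    clean(Seq("025126,Chills,8.10,Malaise,8.10", "junk")).foreach(println)
}
```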
On Wed, Dec 31, 2014 at 9:54 AM, Kapil Malik wrote:
> Hi Sanjay,
>
> I tried running your code on spark shell piece by piece –
>
> // Setup
>
> val line1 = "025126,Chills,8.10,Injection site oedema,8.10,Injection site
> reaction,8.10,Malaise,8.10,Myalgia,8.10"
>
> val line2 = "025127,Chills,8.10,Injection site oedema,8.10,Injection site
> reaction,8.10,M
Why don't you push "\n" instead of "\t" in your first transformation [
(fields(0),(fields(1)+"\t"+fields(3)+"\t"+fields(5)+"\t"+fields(7)+"\t"
+fields(9)))] and then do saveAsTextFile?
-Raghavendra
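The effect of Raghavendra's suggestion can be sketched without Spark: saveAsTextFile writes one element per line, so joining fields with "\n" inside the value makes each selected field land on its own line in the output files. A sketch (`format` is an illustrative name; the field positions follow the snippet above):

```scala
// Joining with "\n" instead of "\t": each selected field becomes its
// own line once the records are written out one element per line.
object NewlineJoinSketch {
  def format(line: String): String = {
    val fields = line.split(',')
    // keep fields 1 and 3 (the reaction names), one per line
    Seq(fields(1), fields(3)).mkString("\n")
  }

  def main(args: Array[String]): Unit =
    println(format("025126,Chills,8.10,Malaise,8.10"))
    // prints Chills and Malaise on separate lines
}
```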
On Wed, Dec 31, 2014 at 1:42:55 PM, Sanjay Subramanian wrote:
> hey guys
>
> My dataset is like this