you could join, it'll give you the intersection and a list of the labels where the value was found.
> a.join(b).collect
Array[(String, (String, String))] = Array((4,(a,b)), (3,(a,b)))

best,


matt

On 08/31/2014 09:23 PM, Liu, Raymond wrote:
> You could use cogroup to combine RDDs in one RDD for cross reference
> processing.
>
> e.g.
>
> a.cogroup(b).filter{case (_, (l,r)) => l.nonEmpty && r.nonEmpty }.map{case
> (k,(l,r)) => (k, l)}
>
> Best Regards,
> Raymond Liu
>
> -----Original Message-----
> From: marylucy [mailto:qaz163wsx_...@hotmail.com]
> Sent: Friday, August 29, 2014 9:26 PM
> To: Matthew Farrellee
> Cc: user@spark.apache.org
> Subject: Re: how to filter value in spark
>
> i see it works well, thank you!!!
>
> But in the following situation, how to do it?
>
> var a = sc.textFile("/sparktest/1/").map((_,"a"))
> var b = sc.textFile("/sparktest/2/").map((_,"b"))
> How to get (3,"a") and (4,"a")?
>
>
> On Aug 28, 2014, at 19:54, "Matthew Farrellee" <m...@redhat.com> wrote:
>
>> On 08/28/2014 07:20 AM, marylucy wrote:
>>> fileA=1 2 3 4, one number per line, saved in /sparktest/1/
>>> fileB=3 4 5 6, one number per line, saved in /sparktest/2/
>>> I want to get 3 and 4
>>>
>>> var a = sc.textFile("/sparktest/1/").map((_,1))
>>> var b = sc.textFile("/sparktest/2/").map((_,1))
>>>
>>> a.filter(param=>{b.lookup(param._1).length>0}).map(_._1).foreach(println)
>>>
>>> Error thrown:
>>> Scala.MatchError:Null
>>> PairRDDFunctions.lookup...
>>
>> the issue is nesting of the b rdd inside a transformation of the a rdd
>>
>> consider using intersection, it's more idiomatic
>>
>> a.intersection(b).foreach(println)
>>
>> but note that intersection will remove duplicates
>>
>> best,
>>
>>
>> matt
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
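
For reference, the join and cogroup suggestions in this thread can be sketched roughly as below. This is only a sketch, assuming a spark-shell session where `sc` is already defined; `sc.parallelize` stands in for the `sc.textFile` calls so it does not depend on the /sparktest files, and the names `viaJoin` and `viaCogroup` are made up for illustration. One thing to watch: mapping the cogroup result to `(k, l)` leaves a collection of labels as the value, so the sketch flattens it with `flatMap` to get plain `(key, label)` pairs.

```scala
// Assumes spark-shell, where `sc` (the SparkContext) already exists.
// parallelize replaces textFile here purely for illustration.
val a = sc.parallelize(Seq("1", "2", "3", "4")).map((_, "a"))
val b = sc.parallelize(Seq("3", "4", "5", "6")).map((_, "b"))

// join keeps only keys present in both RDDs; keep the label that came from `a`.
val viaJoin = a.join(b).map { case (k, (labelA, _)) => (k, labelA) }

// cogroup groups the values from each side per key; keep keys seen on both
// sides, then emit one (key, label) pair per label that came from `a`.
val viaCogroup = a.cogroup(b)
  .filter { case (_, (l, r)) => l.nonEmpty && r.nonEmpty }
  .flatMap { case (k, (l, _)) => l.map(label => (k, label)) }

viaJoin.collect().sorted.foreach(println)
viaCogroup.collect().sorted.foreach(println)
```

If a key can repeat on both sides, join emits one pair per combination of matching values, so the two results may differ in how many times a key shows up.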