Thanks Akhil, Mark for your valuable comments. Problem resolved.

AT
On Tue, Jun 9, 2015 at 2:17 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> I think yes, as the documentation says "Creates tuples of the elements
> in this RDD by applying f."
>
> Thanks
> Best Regards
>
> On Tue, Jun 9, 2015 at 1:54 PM, amit tewari <amittewar...@gmail.com> wrote:
>
>> Actually the question was: will keyBy() accept multiple fields (e.g.
>> x(0), x(1)) as the key?
>>
>> On Tue, Jun 9, 2015 at 1:07 PM, amit tewari <amittewar...@gmail.com> wrote:
>>
>>> Thanks Akhil, as you suggested, I have to go the keyBy() route as I
>>> need the columns intact. But will keyBy() accept multiple fields
>>> (e.g. x(0), x(1))?
>>>
>>> Thanks
>>> Amit
>>>
>>> On Tue, Jun 9, 2015 at 12:26 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>>
>>>> Try this way:
>>>>
>>>> scala> val input1 = sc.textFile("/test7").map(line => line.split(",").map(_.trim));
>>>> scala> val input2 = sc.textFile("/test8").map(line => line.split(",").map(_.trim));
>>>> scala> val input11 = input1.map(x => ((x(0) + x(1)), x(2), x(3)))
>>>> scala> val input22 = input2.map(x => ((x(0) + x(1)), x(2), x(3)))
>>>>
>>>> scala> input11.join(input22).take(10)
>>>>
>>>> PairRDDFunctions basically requires an RDD[(K, V)], and in your case it
>>>> is ((String, String), String, String). You can also look at keyBy if
>>>> you don't want to concatenate your keys.
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Tue, Jun 9, 2015 at 10:14 AM, amit tewari <amittewar...@gmail.com> wrote:
>>>>
>>>>> Hi Dear Spark Users,
>>>>>
>>>>> I am very new to Spark/Scala.
>>>>>
>>>>> I am using DataStax (4.7/Spark 1.2.1) and struggling with the
>>>>> following error/issue.
>>>>>
>>>>> I have already tried options like import org.apache.spark.SparkContext._
>>>>> or an explicit import org.apache.spark.SparkContext.rddToPairRDDFunctions,
>>>>> but the error is not resolved.
>>>>>
>>>>> Help much appreciated.
>>>>>
>>>>> Thanks
>>>>> AT
>>>>>
>>>>> scala> val input1 = sc.textFile("/test7").map(line => line.split(",").map(_.trim));
>>>>> scala> val input2 = sc.textFile("/test8").map(line => line.split(",").map(_.trim));
>>>>> scala> val input11 = input1.map(x => ((x(0), x(1)), x(2), x(3)))
>>>>> scala> val input22 = input2.map(x => ((x(0), x(1)), x(2), x(3)))
>>>>>
>>>>> scala> input11.join(input22).take(10)
>>>>>
>>>>> <console>:66: error: value join is not a member of
>>>>> org.apache.spark.rdd.RDD[((String, String), String, String)]
>>>>>
>>>>>   input11.join(input22).take(10)
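
[Editor's note: for reference, below is a minimal sketch of the two fixes the thread converges on. It assumes a spark-shell session (so sc is already defined) on Spark 1.2.x with the same /test7 and /test8 inputs used above; the variable names pair1, pair2, keyed1 and keyed2 are illustrative. The original input11 is an RDD of 3-tuples, not an RDD[(K, V)], so the implicit conversion to PairRDDFunctions never applies and join is unavailable; either map to a genuine key/value pair, or use keyBy with a composite (tuple) key.]

    import org.apache.spark.SparkContext._   // needed on Spark 1.2.x for the pair-RDD implicits

    // As in the thread: each line becomes an Array[String] of trimmed fields.
    val input1 = sc.textFile("/test7").map(line => line.split(",").map(_.trim))
    val input2 = sc.textFile("/test8").map(line => line.split(",").map(_.trim))

    // Option 1: build a real key/value pair, RDD[((String, String), (String, String))],
    // so the PairRDDFunctions methods (join, reduceByKey, ...) become available.
    val pair1 = input1.map(x => ((x(0), x(1)), (x(2), x(3))))
    val pair2 = input2.map(x => ((x(0), x(1)), (x(2), x(3))))
    pair1.join(pair2).take(10)

    // Option 2: keyBy with a tuple key keeps the whole row array intact as the value.
    val keyed1 = input1.keyBy(x => (x(0), x(1)))
    val keyed2 = input2.keyBy(x => (x(0), x(1)))
    keyed1.join(keyed2).take(10)   // RDD[((String, String), (Array[String], Array[String]))]

keyBy(f) produces (f(x), x) pairs, so a tuple-valued f gives a composite key while all columns stay intact as the value, which is the approach the thread settles on.]
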