Thanks Akhil and Mark for your valuable comments.
Problem resolved.

AT


On Tue, Jun 9, 2015 at 2:17 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> I think yes, as the documentation says "Creates tuples of the elements
> in this RDD by applying f."
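>
> For example (a minimal sketch, assuming the same comma-split inputs as
> below), keyBy can return a tuple as the key, and the whole row stays
> intact as the value:
>
> scala> val rows = sc.textFile("/test7").map(_.split(",").map(_.trim))
> scala> val keyed = rows.keyBy(x => (x(0), x(1)))  // RDD[((String, String), Array[String])]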
>
>
> Thanks
> Best Regards
>
> On Tue, Jun 9, 2015 at 1:54 PM, amit tewari <amittewar...@gmail.com>
> wrote:
>
>> Actually the question was: will keyBy() accept multiple fields (e.g.
>> x(0), x(1)) as the key?
>>
>>
>> On Tue, Jun 9, 2015 at 1:07 PM, amit tewari <amittewar...@gmail.com>
>> wrote:
>>
>>> Thanks Akhil. As you suggested, I have to go the keyBy() route, as I
>>> need the columns intact.
>>> But will keyBy() accept multiple fields (e.g. x(0), x(1))?
>>>
>>> Thanks
>>> Amit
>>>
>>> On Tue, Jun 9, 2015 at 12:26 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> Try this way:
>>>>
>>>> scala> val input1 = sc.textFile("/test7").map(line => line.split(",").map(_.trim))
>>>> scala> val input2 = sc.textFile("/test8").map(line => line.split(",").map(_.trim))
>>>> scala> val input11 = input1.map(x => (x(0) + x(1), (x(2), x(3))))  // RDD[(String, (String, String))]
>>>> scala> val input22 = input2.map(x => (x(0) + x(1), (x(2), x(3))))
>>>>
>>>> scala> input11.join(input22).take(10)
>>>>
>>>>
>>>> PairRDDFunctions basically requires an RDD[(K, V)], and in your case the
>>>> element type is the 3-tuple ((String, String), String, String). You can
>>>> also look at keyBy if you don't want to concatenate your keys.
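>>>>
>>>> For instance (a sketch, assuming the same split-and-trimmed inputs as
>>>> above), keyBy keeps every column intact in the value:
>>>>
>>>> scala> val input11 = input1.keyBy(x => (x(0), x(1)))  // RDD[((String, String), Array[String])]
>>>> scala> val input22 = input2.keyBy(x => (x(0), x(1)))
>>>> scala> input11.join(input22).take(10)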
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Tue, Jun 9, 2015 at 10:14 AM, amit tewari <amittewar...@gmail.com>
>>>> wrote:
>>>>
>>>>> Dear Spark users,
>>>>>
>>>>> I am very new to Spark/Scala.
>>>>>
>>>>> I am using DataStax (4.7/Spark 1.2.1) and struggling with the
>>>>> following error.
>>>>>
>>>>> I have already tried options like import org.apache.spark.SparkContext._
>>>>> and the explicit import org.apache.spark.SparkContext.rddToPairRDDFunctions,
>>>>> but the error is not resolved.
>>>>>
>>>>> Help much appreciated.
>>>>>
>>>>> Thanks
>>>>> AT
>>>>>
>>>>> scala> val input1 = sc.textFile("/test7").map(line => line.split(",").map(_.trim))
>>>>> scala> val input2 = sc.textFile("/test8").map(line => line.split(",").map(_.trim))
>>>>> scala> val input11 = input1.map(x => ((x(0), x(1)), x(2), x(3)))
>>>>> scala> val input22 = input2.map(x => ((x(0), x(1)), x(2), x(3)))
>>>>>
>>>>> scala> input11.join(input22).take(10)
>>>>>
>>>>> <console>:66: error: value join is not a member of org.apache.spark.rdd.RDD[((String, String), String, String)]
>>>>>
>>>>>               input11.join(input22).take(10)
>>>>>
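>>>>> (For reference, a minimal sketch of what join needs here, assuming the
>>>>> same inputs: group the non-key columns into one value so the element
>>>>> type is a pair:
>>>>>
>>>>> scala> val input11 = input1.map(x => ((x(0), x(1)), (x(2), x(3))))  // RDD[((String, String), (String, String))]
>>>>>
>>>>> With both sides in that shape, the PairRDDFunctions implicits apply
>>>>> and join compiles.)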
>>>>
>>>
>>
>
