Also, in Spark v1.6+ you can pass the join expression and the join type that you'd like to use:

val df1 = Seq((1, 0), (2, 0), (3, 0)).toDF("id", "A")
val df2 = Seq((1, 0), (2, 0), (3, 0)).toDF("id", "B")
df1.join(df2, df1("id") === df2("id"), "outer").show
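
To get back to the original question about merging the two id columns into one, here is a minimal sketch rather than a definitive recipe. It assumes a Spark 1.6+ spark-shell session where `sqlContext` is predefined; the sample rows (including the extra id 4) and the name `merged` are made up for illustration. It keeps the expression join and uses `coalesce` to fold both id columns into a single one, so the registered table no longer has an ambiguous reference:

```scala
// Assumption: running inside spark-shell (Spark 1.6+), where `sqlContext` already exists.
import sqlContext.implicits._
import org.apache.spark.sql.functions.coalesce

// Illustrative data: id 3 exists only on the left, id 4 only on the right.
val df1 = Seq((1, 0), (2, 0), (3, 0)).toDF("id", "A")
val df2 = Seq((1, 0), (2, 0), (4, 0)).toDF("id", "B")

// Full outer join on the expression, then take whichever side's id is non-null.
val merged = df1.join(df2, df1("id") === df2("id"), "outer")
  .select(coalesce(df1("id"), df2("id")).as("id"), df1("A"), df2("B"))

// Only one "id" column remains, so querying it by name is unambiguous.
merged.registerTempTable("join_test")
sqlContext.sql("select id from join_test").show()
```

Joining on the column name with an explicit join type, `df1.join(df2, Seq("id"), "outer")`, also yields a single id column, but it is worth checking on your Spark version how ids that exist only on the right side come out.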

// maropu


On Wed, May 18, 2016 at 3:29 PM, ram kumar <ramkumarro...@gmail.com> wrote:

> If I run it as
> val rs = s.join(t,"time_id").join(c,"channel_id")
>
> it is treated as an inner join.
>
>
> On Wed, May 18, 2016 at 2:31 AM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Pretty simple. A similar construct, with tables projected as DataFrames:
>>
>> val c = HiveContext.table("channels").select("CHANNEL_ID","CHANNEL_DESC")
>> val t = HiveContext.table("times").select("TIME_ID","CALENDAR_MONTH_DESC")
>> val rs = s.join(t,"time_id").join(c,"channel_id")
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 17 May 2016 at 21:52, Bijay Kumar Pathak <bkpat...@mtu.edu> wrote:
>>
>>> Hi,
>>>
>>> Try this one:
>>>
>>>
>>> df_join = df1.join(df2, 'Id', "fullouter")
>>>
>>> Thanks,
>>> Bijay
>>>
>>>
>>> On Tue, May 17, 2016 at 9:39 AM, ram kumar <ramkumarro...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I tried to join two DataFrames:
>>>>
>>>> val df_join = df1.join(df2, df1("Id") === df2("Id"), "fullouter")
>>>>
>>>> df_join.registerTempTable("join_test")
>>>>
>>>>
>>>> When querying "Id" from "join_test", I get an error:
>>>>
>>>> 0: jdbc:hive2://> select Id from join_test;
>>>> Error: org.apache.spark.sql.AnalysisException: Reference 'Id' is
>>>> ambiguous, could be: Id#128, Id#155.; line 1 pos 7 (state=,code=0)
>>>> 0: jdbc:hive2://>
>>>>
>>>> Is there a way to merge the values of df1("Id") and df2("Id") into one
>>>> "Id" column?
>>>>
>>>> Thanks
>>>>
>>>
>>>
>>
>


-- 
---
Takeshi Yamamuro
