Also, in Spark v1.6+ you can pass the join expression that you'd like to use:

val df1 = Seq((1, 0), (2, 0), (3, 0)).toDF("id", "A")
val df2 = Seq((1, 0), (2, 0), (3, 0)).toDF("id", "B")
df1.join(df2, df1("id") === df2("id"), "outer").show

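Note that joining on an expression keeps both id columns in the result, which is what triggers the "ambiguous" error in the original question. A minimal sketch of two ways to end up with a single "id" column follows. It assumes a Spark 2.x+ SparkSession (on 1.6 you would use sqlContext and its implicits instead), and the object name and sample rows are made up for illustration. As far as I know, the Seq("id") form only coalesces the key from both sides in an outer join on Spark 2.0+, so the explicit coalesce is the safer choice on older versions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.coalesce

// Hypothetical standalone example; not taken from the thread.
object MergeJoinKey {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x+ entry point, running locally.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("merge-join-key")
      .getOrCreate()
    import spark.implicits._

    // Sample data chosen so each side has an id the other lacks.
    val df1 = Seq((1, 0), (2, 0)).toDF("id", "A")
    val df2 = Seq((2, 0), (3, 0)).toDF("id", "B")

    // Option 1: join on the column name; the result has one "id" column.
    val byName = df1.join(df2, Seq("id"), "outer")

    // Option 2: join on an expression, then merge the two keys explicitly
    // so that rows present on only one side still get a non-null id.
    val byExpr = df1
      .join(df2, df1("id") === df2("id"), "outer")
      .select(coalesce(df1("id"), df2("id")).as("id"), df1("A"), df2("B"))

    byName.show()
    byExpr.show()

    spark.stop()
  }
}
```

Either result can then be registered as a temp table and queried with "select id from ..." without the ambiguity error, since only one column named id remains.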
// maropu

On Wed, May 18, 2016 at 3:29 PM, ram kumar <ramkumarro...@gmail.com> wrote:

> If I run it as
> val rs = s.join(t,"time_id").join(c,"channel_id")
>
> it is treated as an inner join.
>
>
> On Wed, May 18, 2016 at 2:31 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> pretty simple, a similar construct to tables projected as DF
>>
>> val c = HiveContext.table("channels").select("CHANNEL_ID","CHANNEL_DESC")
>> val t = HiveContext.table("times").select("TIME_ID","CALENDAR_MONTH_DESC")
>> val rs = s.join(t,"time_id").join(c,"channel_id")
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> On 17 May 2016 at 21:52, Bijay Kumar Pathak <bkpat...@mtu.edu> wrote:
>>
>>> Hi,
>>>
>>> Try this one:
>>>
>>> df_join = df1.join(df2, 'Id', "fullouter")
>>>
>>> Thanks,
>>> Bijay
>>>
>>> On Tue, May 17, 2016 at 9:39 AM, ram kumar <ramkumarro...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I tried to join two dataframes:
>>>>
>>>> df_join = df1.join(df2, df1("Id") === df2("Id"), "fullouter")
>>>>
>>>> df_join.registerTempTable("join_test")
>>>>
>>>> When querying "Id" from "join_test":
>>>>
>>>> 0: jdbc:hive2://> select Id from join_test;
>>>> Error: org.apache.spark.sql.AnalysisException: Reference 'Id' is
>>>> ambiguous, could be: Id#128, Id#155.; line 1 pos 7 (state=,code=0)
>>>> 0: jdbc:hive2://>
>>>>
>>>> Is there a way to merge the values of df1("Id") and df2("Id") into one
>>>> "Id"?
>>>>
>>>> Thanks

--
---
Takeshi Yamamuro