I wonder if Spark can provide better support for this case. The following
schema is not user friendly (shown previously):

StructField(b,IntegerType,false), StructField(b,IntegerType,false)

Except for 'select *', there is no way for the user to query either of the
two fields.

On Tue, Apr 26, 2016 at 10:17 PM, Takeshi Yamamuro <linguin....@gmail.com> wrote:
> Based on my example, how about renaming columns?
>
> val df1 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
> val df2 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
> val df3 = df1.join(df2, "a").select($"a", df1("b").as("1-b"),
>   df2("b").as("2-b"))
> val df4 = df3.join(df2, df3("2-b") === df2("b"))
>
> // maropu
>
> On Wed, Apr 27, 2016 at 1:58 PM, Divya Gehlot <divya.htco...@gmail.com> wrote:
>> Correct, Takeshi.
>> I am facing the same issue.
>>
>> How do I avoid the ambiguity?
>>
>> On 27 April 2016 at 11:54, Takeshi Yamamuro <linguin....@gmail.com> wrote:
>>> Hi,
>>>
>>> I tried:
>>> val df1 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
>>> val df2 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
>>> val df3 = df1.join(df2, "a")
>>> val df4 = df3.join(df2, "b")
>>>
>>> And I got:
>>> org.apache.spark.sql.AnalysisException: Reference 'b' is ambiguous,
>>> could be: b#6, b#14.;
>>> If this is the same case, the message makes sense and is clear.
>>>
>>> Thoughts?
>>>
>>> // maropu
>>>
>>> On Wed, Apr 27, 2016 at 6:09 AM, Prasad Ravilla <pras...@slalom.com> wrote:
>>>> Also, check the column names of df1 (after joining df2 and df3).
>>>>
>>>> Prasad.
>>>>
>>>> From: Ted Yu
>>>> Date: Monday, April 25, 2016 at 8:35 PM
>>>> To: Divya Gehlot
>>>> Cc: "user @spark"
>>>> Subject: Re: Can't join same dataframe twice?
>>>>
>>>> Can you show us the structure of df2 and df3?
>>>>
>>>> Thanks
>>>>
>>>> On Mon, Apr 25, 2016 at 8:23 PM, Divya Gehlot <divya.htco...@gmail.com> wrote:
>>>>> Hi,
>>>>> I am using Spark 1.5.2.
>>>>> I have a use case where I need to join the same dataframe twice on
>>>>> two different columns, and I am getting a "missing columns" error.
>>>>>
>>>>> For instance:
>>>>> val df1 = df2.join(df3, "Column1")
>>>>> The line below throws the "missing columns" error:
>>>>> val df4 = df1.join(df3, "Column2")
>>>>>
>>>>> Is this a bug or a valid scenario?
>>>>>
>>>>> Thanks,
>>>>> Divya
>>>>
>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
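
A complement to the rename approach earlier in the thread: aliasing each
DataFrame before the join also gives every duplicate column an unambiguous
qualified name. A minimal sketch, not a definitive fix (the `a`/`b` column
names follow the examples above; it assumes a spark-shell session where the
SQL implicits such as `$` are in scope):

```scala
// Two DataFrames with identical column names, as in the thread.
val df1 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")
val df2 = Seq((1, 1), (2, 2), (3, 3)).toDF("a", "b")

// Alias each side of the join so columns can be qualified by alias.
val joined = df1.as("l").join(df2.as("r"), $"l.a" === $"r.a")

// Both 'b' columns are now addressable without 'select *';
// renaming on the way out keeps the output schema unambiguous too.
val picked = joined.select($"l.a".as("a"), $"l.b".as("l_b"), $"r.b".as("r_b"))
```

A later join against `picked` then sees distinct column names (`l_b`, `r_b`),
so the ambiguous-reference error should not arise.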