Re: Dataframe multiple joins with same dataframe not able to resolve correct join columns

2018-07-11 Thread Ben White
Sounds like the same root cause as SPARK-14948 or SPARK-10925. A workaround is to "clone" df3 like this:
    val df3clone = df3.toDF(df.schema.fieldNames:_*)
Then use df3clone in place of df3 in the second join.
On Wed, Jul 11, 2018 at 2:52 PM Nirav Patel wrote:
> I am trying to join df1 with
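The "clone" workaround from the reply above can be sketched as follows. This is a minimal illustration, not the poster's actual code. The sample rows, the V1/V2 columns, the helper name cloneDf, and the object name CloneJoinSketch are all hypothetical. The sketch also assumes the column names are taken from df3 itself (df3.schema.fieldNames) and that df2's key columns are dropped after the first join so that df3 has unique column names. The idea, as far as I understand it, is that toDF re-projects every column, so the clone's columns can be told apart from df2's in the second join condition.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CloneJoinSketch {
  // Hypothetical helper: re-project every column under its existing name.
  // toDF builds a new projection, so the clone's columns are distinct
  // references from those of the DataFrames it was derived from.
  def cloneDf(df: DataFrame): DataFrame = df.toDF(df.schema.fieldNames: _*)

  def demo(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Hypothetical sample data; only the key column names come from the thread.
    val df1 = Seq((1, 10, "a")).toDF("PARTICIPANT_ID", "BUSINESS_ID", "V1")
    val df2 = Seq((1, 10, "b")).toDF("PARTICIPANT_ID", "BUSINESS_ID", "V2")

    // First join. Dropping df2's key columns is an assumption made here
    // so that df3 ends up with unique column names.
    val df3 = df1
      .join(df2, df1("PARTICIPANT_ID") === df2("PARTICIPANT_ID") and
                 df1("BUSINESS_ID") === df2("BUSINESS_ID"))
      .drop(df2("PARTICIPANT_ID"))
      .drop(df2("BUSINESS_ID"))

    // Second join against df2, using the clone in place of df3.
    val df3clone = cloneDf(df3)
    df3clone.join(df2, df3clone("PARTICIPANT_ID") === df2("PARTICIPANT_ID"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("clone-join-sketch")
      .getOrCreate()
    demo(spark).show()
    spark.stop()
  }
}
```

Running main requires Spark SQL on the classpath. Whether the clone is needed at all may depend on the Spark version in use.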

Dataframe multiple joins with same dataframe not able to resolve correct join columns

2018-07-11 Thread Nirav Patel
I am trying to join df1 with df2, and then join the result with df2 again. df2 is a common dataframe.
    val df3 = df1
      .join(df2, df1("PARTICIPANT_ID") === df2("PARTICIPANT_ID") and df1("BUSINESS_ID") === df2("BUSINESS_ID"))
      .drop(df1("BUSINESS_ID")) //dropping duplica