Hi,

I am trying to port some code that was working in Spark 1.2.0 to the latest
version, Spark 1.3.0. This code involves a left outer join between two
SchemaRDDs, which I am now converting to DataFrames. I followed the example
for a left outer join of DataFrames at

https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html

Here's my code, where df1 and df2 are the 2 dataframes I am joining on the
"country" field:

val join_df = df1.join( df2, df1.country == df2.country, "left_outer")

But I got a compilation error, so I also tried

val join_df = df1.join( df2, df1("country") == df2("country"), "left_outer")

but again got a compilation error that the expression is a Boolean whereas a
Column is required.

So what is the correct Column expression I need to provide for joining the 2
dataframes on a specific field?
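
For reference, a minimal sketch of what I suspect the fix is, assuming the
Spark 1.3.0 Scala API: Column defines === for building a column-equality
expression, whereas Scala's == compares the two Column objects themselves and
returns a plain Boolean, which matches the error above.

// === yields a Column expression, so the join condition type-checks
val join_df = df1.join(df2, df1("country") === df2("country"), "left_outer")

Note that the joined DataFrame keeps both country columns, so selecting
df1("country") explicitly afterwards avoids an ambiguous column reference.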
thanks