Spark 2.0 is EOL. Can you try 2.3 or 2.4?
On Thu, Mar 21, 2019 at 10:23 AM asma zgolli wrote:
Hello,

I need to cross my data, and I'm executing a cross join on two DataFrames:

C = A.crossJoin(B)

A has 50 records
B has 5 records

The result I'm getting with Spark 2.0 is a DataFrame C with only 50 records:
each row of A was paired with only the first row of B.

Is that a bug in Spark?
Asma ZGOLLI
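For reference, crossJoin is defined as the Cartesian product of the two DataFrames, so with 50 and 5 input rows C should contain 50 * 5 = 250 rows, not 50. A minimal plain-Python sketch of the expected semantics (illustrative only, not Spark code):

```python
from itertools import product

# Toy stand-ins for the two DataFrames: 50 rows in A, 5 rows in B.
A = [{"a_id": i} for i in range(50)]
B = [{"b_id": j} for j in range(5)]

# A cross join pairs every row of A with every row of B.
C = [{**ra, **rb} for ra, rb in product(A, B)]

print(len(C))  # 250, i.e. 50 * 5
```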
Thanks, Andrew. I completely missed that. It worked after removing the
null-safe join condition.
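For context: Spark's null-safe equality (`<=>` in Scala, `Column.eqNullSafe` in the API) treats two NULLs as equal, whereas plain SQL equality on NULLs yields NULL. A plain-Python sketch of the null-safe-equality semantics (illustrative only, not Spark code):

```python
def null_safe_eq(x, y):
    """Spark's <=> semantics: NULL <=> NULL is True, NULL <=> value is False."""
    if x is None and y is None:
        return True
    if x is None or y is None:
        return False
    return x == y

print(null_safe_eq(None, None))  # True
print(null_safe_eq(None, 1))     # False
print(null_safe_eq(1, 1))        # True
```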
> ode").between(8, 27).isNotNull)
> .groupBy(
> baseDF("medical_claim_id"),
> baseDF("medical_claim_detail_id")
> )
> .agg(min(revCdDF("rtos_2_code").alias("min_rtos_2_8_thru_27")),
> min(revCdDF("rtos_2_hierarchy").alias("min_rtos_2_8_thru_27_hier")))
>
> If I remove the multiple columns from the join and instead create a separate
> join statement for each one, the exception goes away. Is there a better way
> to join on multiple columns?
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Scala-left-join-with-multiple-columns-Join-condition-is-missing-or-trivial-Use-the-CROSS-JOIN-syntax-tp21297.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
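In Spark, one common alternative is to combine the equality tests with `&&` into a single join expression, or, when the key columns have the same names on both sides, to pass the names directly, e.g. `baseDF.join(revCdDF, Seq("medical_claim_id", "medical_claim_detail_id"), "left")`. The underlying semantics of a left join on multiple key columns can be sketched in plain Python (illustrative only, not Spark code):

```python
def left_join(left, right, keys):
    """Left join two lists of dict rows on the given key columns."""
    # Index the right side by its tuple of key values.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    out = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys))
        if matches:
            for m in matches:
                out.append({**row, **m})
        else:
            # No match: keep the left row; right-side columns stay absent (NULL).
            out.append(dict(row))
    return out

A = [{"id": 1, "det": 10, "x": "a"}, {"id": 2, "det": 20, "x": "b"}]
B = [{"id": 1, "det": 10, "y": "p"}]
print(left_join(A, B, ["id", "det"]))
# [{'id': 1, 'det': 10, 'x': 'a', 'y': 'p'}, {'id': 2, 'det': 20, 'x': 'b'}]
```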