[68] at explain at <console>:25
Could Spark SQL developers suggest why it happens?
Best regards, Alexander
From: Stephen Carman [mailto:scar...@coldlight.com]
Sent: Wednesday, June 24, 2015 12:33 PM
To: Ulanov, Alexander
Cc: CC GP; dev@spark.apache.org
Subject: Re: Force inner join to shuffle the smallest table
Ulanov, Alexander <...@hp.com> wrote:
It also fails, as I mentioned in the original question.
From: CC GP [mailto:chandrika.gopalakris...@gmail.com]
Sent: Wednesday, June 24, 2015 12:08 PM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Force inner join to shuffle the smallest table
Try below and see if it makes a difference:
val result = sqlContext.sql("select big.f1, big.f2 from small inner join big on big.s=small.s and big.d=small.d")
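To check whether reordering actually changed which side gets shuffled, one can print the physical plan (a minimal check, assuming the result DataFrame defined above):

// Print the physical plan; look for which table feeds the Exchange
// (shuffle) operator and whether a BroadcastHashJoin was chosen.
result.explain()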
On Wed, Jun 24, 2015 at 11:35 AM, Ulanov, Alexander wrote:
Hi,
I am trying to inner join two tables on two fields (string and double). One table has 2B rows, the second has 500K. They are stored in HDFS as Parquet. Spark v1.4.
val big = sqlContext.parquetFile("hdfs://big")
big.registerTempTable("big")
val small = sqlContext.parquetFile("hdfs://small")
small.registerTempTable("small")
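Since the small table is only 500K rows, one way to avoid shuffling the 2B-row table is to let Spark broadcast the small side. A minimal sketch, assuming the tables registered above; the 100 MB threshold here is an arbitrary example, not a value from the thread:

// Raise the auto-broadcast threshold so the 500K-row table qualifies
// for a broadcast join instead of a shuffled join.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", (100 * 1024 * 1024).toString)
val result = sqlContext.sql("select big.f1, big.f2 from big inner join small on big.s = small.s and big.d = small.d")
result.explain()  // the plan should now show BroadcastHashJoin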