For outer joins I'd recommend upgrading to master or waiting for a 1.1
release candidate (which should be out this week).
On Tue, Aug 5, 2014 at 7:38 AM, Dima Zhiyanov
wrote:
> I am also experiencing this kryo buffer problem. My join is left outer with
> under 40mb on the right side. I would ex
Yes
> On Aug 5, 2014, at 7:38 AM, "Dima Zhiyanov [via Apache Spark User List]"
> wrote:
>
> I am also experiencing this kryo buffer problem. My join is left outer with
> under 40mb on the right side. I would expect the broadcast join to succeed
> in this case (hive did)
I am also experiencing this Kryo buffer problem. My join is a left outer join
with under 40 MB on the right side. I would expect the broadcast join to
succeed in this case (Hive did).
Another problem is that the optimizer chose a nested loop join for some
reason. I would expect a broadcast (map-side) hash join.
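To make the difference between the two strategies concrete, here is a minimal plain-Python sketch. It is not Spark's implementation, and the function names and sample rows are made up for illustration. A nested loop join compares every left row with every right row, while a map-side hash join builds a hash table on the small (broadcast) side once and probes it for each left row.

```python
from collections import defaultdict


def nested_loop_left_outer_join(left, right, key):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    out = []
    for l in left:
        matched = False
        for r in right:
            if l[key] == r[key]:
                out.append((l, r))
                matched = True
        if not matched:
            out.append((l, None))  # left outer: keep unmatched left rows
    return out


def hash_left_outer_join(left, right, key):
    """Build a hash table on the small right side once, then probe it per left row."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    out = []
    for l in left:
        matches = table.get(l[key])  # .get() avoids creating empty entries
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))  # left outer: keep unmatched left rows
    return out


# Hypothetical sample data: the right side plays the role of the small table.
left = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
right = [{"id": 1, "w": "x"}, {"id": 1, "w": "y"}, {"id": 3, "w": "z"}]
```

Both functions return the same rows for this input. The hash version does one pass over each side instead of a full cross comparison, which is why it is the expected plan when one side is small enough to ship to every task.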
>
> When SPARK-2211 is done, will spark sql automatically choose join
> algorithms?
> Is there some way to manually hint the optimizer?
>
Ideally we will select the best algorithm for you. We are also considering
ways to allow the user to hint.
Hi Michael,
Thanks for the suggestion. In my query, both tables are too large to use a
broadcast join.
When SPARK-2211 is done, will Spark SQL automatically choose join
algorithms?
Is there some way to manually hint the optimizer?
2014-07-19 5:23 GMT+08:00 Michael Armbrust :
> Unfortunately, this
Unfortunately, this is a query where we just don't have an efficient
implementation yet. You might try switching the table order.
Here is the JIRA for doing something more efficient:
https://issues.apache.org/jira/browse/SPARK-2212
On Fri, Jul 18, 2014 at 7:05 AM, Pei-Lun Lee wrote:
> Hi,
>