[ https://issues.apache.org/jira/browse/SPARK-18597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nattavut Sutyanyong updated SPARK-18597:
----------------------------------------
Description:
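The optimizer pushes down filters for left anti joins: a join condition that
references only the left-hand table is turned into a filter on the left input.
This unfortunately has the opposite effect, because the filter keeps the rows
that satisfy the condition, while the anti join should drop them. For example: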
{noformat}
sql("create or replace temporary view tbl_a as values (1, 5), (2, 1), (3, 6) as t(c1, c2)")
sql("create or replace temporary view tbl_b as values 1 as t(c1)")
sql("""
select *
from tbl_a
left anti join tbl_b on ((tbl_a.c1 = tbl_a.c2) is null or tbl_a.c1 = tbl_a.c2)
""")
{noformat}
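The join condition evaluates to false for every row of tbl_a, so no row has a
match and the query should return rows [1, 5], [2, 1] and [3, 6]. Instead it
returns no rows.
The upside is that this only happens with an unusual anti-join whose condition
references only the table on the left-hand side.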
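The difference between the two evaluation orders can be sketched in plain
Scala, without Spark. This is only a model of the example above, not Spark's
optimizer code: the object AntiJoinPushdownSketch and the helpers sqlEq, sqlOr,
cond and leftAnti are hypothetical names, and NULL is modelled as None.

```scala
// Plain-Scala model of the example; no Spark dependency.
// All names below are hypothetical and exist only for illustration.
object AntiJoinPushdownSketch {
  type Row = (Int, Int)

  val tblA: Seq[Row] = Seq((1, 5), (2, 1), (3, 6))
  val tblB: Seq[Int] = Seq(1)

  // SQL three-valued logic: None stands for NULL.
  def sqlEq(a: Option[Int], b: Option[Int]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  def sqlOr(a: Option[Boolean], b: Option[Boolean]): Option[Boolean] =
    (a, b) match {
      case (Some(true), _) | (_, Some(true)) => Some(true)
      case (Some(false), Some(false))        => Some(false)
      case _                                 => None
    }

  // (c1 = c2) IS NULL OR c1 = c2; references only the left table.
  def cond(a: Row): Option[Boolean] = {
    val eq = sqlEq(Some(a._1), Some(a._2))
    sqlOr(Some(eq.isEmpty), eq)
  }

  // Left anti join: keep a left row when no right row satisfies the condition.
  def leftAnti(left: Seq[Row], right: Seq[Int])(
      on: (Row, Int) => Option[Boolean]): Seq[Row] =
    left.filter(a => !right.exists(b => on(a, b).contains(true)))

  // Intended semantics: the condition is evaluated as part of the join.
  val expected: Seq[Row] = leftAnti(tblA, tblB)((a, _) => cond(a))

  // Assumed model of the faulty rewrite: the condition becomes a filter on
  // the left input and the join itself is left without a condition.
  val pushedDown: Seq[Row] =
    leftAnti(tblA.filter(a => cond(a).contains(true)), tblB)((_, _) => Some(true))

  def main(args: Array[String]): Unit = {
    println(s"expected:   $expected")
    println(s"pushedDown: $pushedDown")
  }
}
```

In this model expected holds all three rows of tbl_a, while pushedDown is
empty, which matches the behaviour reported above.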
was:
The optimizer pushes down filters for left anti joins. This unfortunately has
the opposite effect. For example:
{noformat}
sql("create or replace temporary view tbl_a as values (1, 5), (2, 1), (3, 6) as t(c1, c2)")
sql("create or replace temporary view tbl_b as values 1 as t(c1)")
sql("""
select *
from tbl_a
left anti join tbl_b on ((tbl_a.c1 = tbl_a.c2) is null or tbl_a.c1 = tbl_a.c2)
""")
{noformat}
Should return rows [2, 1] & [3, 6], but returns no rows.
The upside is that this will only happen when you use a really weird anti-join
(only referencing the table on the left hand side).
> Do not push down filters for LEFT ANTI JOIN
> -------------------------------------------
>
> Key: SPARK-18597
> URL: https://issues.apache.org/jira/browse/SPARK-18597
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Herman van Hovell
> Assignee: Herman van Hovell
> Priority: Minor
> Labels: correctness
> Fix For: 2.1.0