GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/13155
[SPARK-15370] [SQL] Update RewriteCorrelatedScalarSubquery rule to fix
COUNT bug
## What changes were proposed in this pull request?
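This pull request fixes the COUNT bug in the
`RewriteCorrelatedScalarSubquery` rule. The rule rewrites a correlated scalar
subquery as an outer join, so an outer row with no matching inner rows receives
NULL even when the subquery's expression (for example, `COUNT(*)`) would
return a non-NULL value on empty input.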
After this change, the rule tests the expression at the root of the
correlated subquery to determine whether the expression returns NULL on empty
input. If the expression does not return NULL, the rule generates additional
logic in the Project operator above the rewritten subquery. This additional
logic intercepts NULL values coming from the outer join and replaces them with
the value that the subquery's expression would return on empty input.
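The behavior described above can be reproduced outside Spark. The following is a minimal sketch using Python's built-in `sqlite3` module, not the Catalyst rule itself. The tables `l` and `r`, their columns, and the hand-written join rewrites are assumptions made for illustration, and `COALESCE` stands in for the additional Project logic only because `COUNT(*)` never returns NULL for a matched group.

```python
import sqlite3

# Hypothetical tables: l is the outer relation, r the inner one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE l (a INTEGER);
    CREATE TABLE r (c INTEGER);
    INSERT INTO l VALUES (1), (2);
    INSERT INTO r VALUES (1), (1);
""")

# Reference semantics: the correlated scalar subquery evaluated per outer row.
original_sql = """
    SELECT l.a, (SELECT COUNT(*) FROM r WHERE r.c = l.a) AS cnt
    FROM l ORDER BY l.a
"""

# Naive decorrelation: aggregate per key, then left outer join.
# Outer rows without a match get NULL instead of 0 (the COUNT bug).
naive_sql = """
    SELECT l.a, sub.cnt
    FROM l LEFT OUTER JOIN
         (SELECT c, COUNT(*) AS cnt FROM r GROUP BY c) AS sub
         ON sub.c = l.a
    ORDER BY l.a
"""

# Patched decorrelation: replace the NULL produced by the outer join with
# the value COUNT(*) returns on empty input, which is 0.
patched_sql = """
    SELECT l.a, COALESCE(sub.cnt, 0) AS cnt
    FROM l LEFT OUTER JOIN
         (SELECT c, COUNT(*) AS cnt FROM r GROUP BY c) AS sub
         ON sub.c = l.a
    ORDER BY l.a
"""

original_rows = conn.execute(original_sql).fetchall()
naive_rows = conn.execute(naive_sql).fetchall()
patched_rows = conn.execute(patched_sql).fetchall()

print("original:", original_rows)
print("naive:   ", naive_rows)
print("patched: ", patched_rows)
```

The outer row `a = 2` has no match in `r`: the original query reports a count of 0 for it, the naive rewrite reports NULL, and the patched rewrite agrees with the original.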
## How was this patch tested?
Added regression tests to cover all branches of the updated rule (see
changes to `SubquerySuite.scala`).
Ran all existing automated regression tests after merging with latest trunk.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/frreiss/spark-sandbox master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/13155.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #13155
----
commit 3b1649105869c72ccb16f86732e04829aaae0e93
Author: frreiss <[email protected]>
Date: 2016-05-16T17:58:00Z
Commit before merge.
commit 58df60d5468e53c4b6fc41a1d7c896abfb01cdd1
Author: frreiss <[email protected]>
Date: 2016-05-16T17:58:21Z
Merge branch 'master' of https://github.com/apache/spark
commit 910cbf54e2300a57640e017610c204da2d462964
Author: frreiss <[email protected]>
Date: 2016-05-16T20:46:55Z
Merge branch 'master' of https://github.com/apache/spark
commit 76d9f4528b8536d1e5680279ab76b9e26dd3a873
Author: frreiss <[email protected]>
Date: 2016-05-17T14:52:46Z
Merge branch 'master' of https://github.com/apache/spark
----