[ https://issues.apache.org/jira/browse/HIVE-19653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492913#comment-16492913 ]

Ashutosh Chauhan commented on HIVE-19653:
-----------------------------------------

+1

> Incorrect predicate pushdown for groupby with grouping sets
> -----------------------------------------------------------
>
>                 Key: HIVE-19653
>                 URL: https://issues.apache.org/jira/browse/HIVE-19653
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 4.0.0
>            Reporter: Zhang Li
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-19653.patch
>
>
> Consider the following query:
> {code:java}
> CREATE TABLE T1(a STRING, b STRING, s BIGINT);
> INSERT OVERWRITE TABLE T1 VALUES ('aaaa', 'bbbb', 123456);
> SELECT * FROM (
>   SELECT a, b, sum(s)
>   FROM T1
>   GROUP BY a, b GROUPING SETS ((), (a), (b), (a, b))
> ) t WHERE a IS NOT NULL;
> {code}
> When hive.optimize.ppd is enabled (and hive.cbo.enable=false), the query
> outputs:
> {code:java}
> NULL  NULL  123456
> NULL  bbbb  123456
> aaaa  NULL  123456
> aaaa  bbbb  123456
> {code}
> The predicate "a IS NOT NULL" has no effect, which is incorrect: the rows
> with a = NULL should have been filtered out.
> When performing the PPD optimization for a GBY operator, we should make sure
> that every grouping set contains the expression being processed before
> pushing it down. Otherwise, the GBY changes the value of the expression (a
> column that is absent from a grouping set is emitted as NULL), so the
> predicate evaluated below the GBY no longer matches the one written above it
> and the result is wrong.
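
The rule in the last paragraph of the description can be illustrated with a small, self-contained simulation. This is only a sketch of the idea, not Hive's optimizer code and not the attached patch: `can_push_down` and `group_by_grouping_sets` are hypothetical helper names, rows are modeled as Python dicts, and NULL is modeled as `None`.

```python
from collections import defaultdict

def can_push_down(predicate_columns, grouping_sets):
    """Hypothetical check: pushdown is safe only if every grouping set
    contains every column that the predicate reads."""
    return all(set(predicate_columns) <= set(gs) for gs in grouping_sets)

def group_by_grouping_sets(rows, keys, grouping_sets, agg_col):
    """Simplified GROUP BY ... GROUPING SETS with SUM(agg_col).
    A key that is absent from a grouping set is emitted as None (NULL)."""
    out = []
    for gs in grouping_sets:
        sums = defaultdict(int)
        for row in rows:
            group_key = tuple(row[k] if k in gs else None for k in keys)
            sums[group_key] += row[agg_col]
        out.extend(key + (total,) for key, total in sums.items())
    return out

rows = [{"a": "aaaa", "b": "bbbb", "s": 123456}]
keys = ("a", "b")
grouping_sets = [(), ("a",), ("b",), ("a", "b")]

# Plan as written in the query: filter "a IS NOT NULL" above the GROUP BY.
filtered_above = [
    r for r in group_by_grouping_sets(rows, keys, grouping_sets, "s")
    if r[0] is not None
]

# Plan after the unsafe pushdown: the filter runs on the base rows, where
# "a" is never NULL, so it removes nothing and the NULL groups survive.
filtered_below = group_by_grouping_sets(
    [r for r in rows if r["a"] is not None], keys, grouping_sets, "s"
)

print(can_push_down(["a"], grouping_sets))  # the sets () and (b) lack "a"
print(filtered_above)
print(filtered_below)
```

Under these assumptions, the filter applied above the GROUP BY keeps two rows, while the pushed-down filter keeps all four, which matches the incorrect output reported in the description. The check succeeds only for something like `GROUPING SETS ((a), (a, b))`, where `a` appears in every set.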