-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29878/
-----------------------------------------------------------
(Updated Jan. 19, 2015, 1:10 a.m.)


Review request for hive.


Changes
-------

Addressed comments


Bugs: HIVE-9347
    https://issues.apache.org/jira/browse/HIVE-9347


Repository: hive-git


Description
-------

It looks like the query below returns incorrect results on Hive 0.13.1, but it was working fine on Hive 0.11.

I have the following table:

    CREATE TABLE `t`(
      `category` int,
      `live` int,
      `comments` int)

with the following data:

    hive> select * from t;
    OK
    3    0    2
    2    0    2
    8    0    2

The query:

    hive> select category, max(live) live, max(comments) comments,
          rank() OVER (PARTITION BY category ORDER BY comments) rank1
          FROM t
          GROUP BY category
          GROUPING SETS ((), (category))
          HAVING max(comments) > 0;

returns the following results:

    NULL    1    48    1
    2       1    49    1
    3       1    49    1
    8       1    49    1

When grouping sets are used together with the rank() function, the max() function returns incorrect results. Everything works fine if I remove the grouping sets clause and split the query into two independent queries, or if I remove the rank() function.

This looks like a bug to me, but please review. That said, I'm not sure whether it is an Amazon-specific issue or a general Hive issue.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 4632f08
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 90b4b12
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java afd1738
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingInferenceOptimizer.java 87fba2d
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingOpProcFactory.java 82f4243
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b93a293
  ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java 7a0b0da

Diff: https://reviews.apache.org/r/29878/diff/


Testing
-------


Thanks,

Navis Ryu
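

Note on expected values
-------

The description lists the incorrect output but not the values one would expect. The sketch below is not part of the patch and does not use Hive; it is a plain-Python emulation, under stated assumptions, of what max() over GROUPING SETS ((), (category)) should yield for the three rows shown above. The function name grouping_sets_max is hypothetical, and rank1 is assumed to be 1 everywhere because each partition holds a single row after grouping.

```python
# Rows of table t as shown in the description: (category, live, comments).
rows = [(3, 0, 2), (2, 0, 2), (8, 0, 2)]


def grouping_sets_max(rows):
    """Emulate GROUP BY category GROUPING SETS ((), (category)) with max().

    Hypothetical helper for illustration only; it is not Hive code.
    """
    groups = {None: []}  # None stands for the grand-total set ()
    for category, live, comments in rows:
        groups[None].append((live, comments))
        groups.setdefault(category, []).append((live, comments))

    result = []
    for category, members in groups.items():
        max_live = max(live for live, _ in members)
        max_comments = max(comments for _, comments in members)
        if max_comments > 0:  # HAVING max(comments) > 0
            # Assumption: one row per partition after grouping, so rank is 1.
            result.append((category, max_live, max_comments, 1))

    # Put the grand-total (NULL category) row first, then sort by category.
    return sorted(result, key=lambda r: (r[0] is not None, r[0] or 0))


for row in grouping_sets_max(rows):
    print(row)
```

Under these assumptions every row carries max(live) = 0 and max(comments) = 2, which differs from the 1 and 48/49 values reported in the description.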