> On Jan. 16, 2015, 6:20 p.m., Ashutosh Chauhan wrote: > > overall looks good. Few minor comments.
Thanks for the review. Fixed BucketingSortingInferenceOptimizer to reflect grouping set pruning and changed default value of grouping set id to -1 in semijoin cases. > On Jan. 16, 2015, 6:20 p.m., Ashutosh Chauhan wrote: > > ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out, > > line 147 > > <https://reviews.apache.org/r/29878/diff/1/?file=820775#file820775line147> > > > > should this have been only key,value ? Fixed BucketingSortingInferenceOptimizer to reflect grouping set pruning > On Jan. 16, 2015, 6:20 p.m., Ashutosh Chauhan wrote: > > ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out, > > line 334 > > <https://reviews.apache.org/r/29878/diff/1/?file=820775#file820775line334> > > > > seems like agg column should not be present there? Can you check? Same with above. - Navis ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29878/#review68438 ----------------------------------------------------------- On Jan. 14, 2015, 9:13 a.m., Navis Ryu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/29878/ > ----------------------------------------------------------- > > (Updated Jan. 14, 2015, 9:13 a.m.) > > > Review request for hive. > > > Bugs: HIVE-9347 > https://issues.apache.org/jira/browse/HIVE-9347 > > > Repository: hive-git > > > Description > ------- > > It looks like the query below returns incorrect results on Hive 0.13.1, but > it was working fine on Hive 0.11. > > I have the following table: > CREATE TABLE `t`( > `category` int, > `live` int, > `comments` int) > > with the following data: > hive> select * from t; > OK > 3 0 2 > 2 0 2 > 8 0 2 > > The query: > hive> select category, max(live) live, max(comments) comments, rank() OVER > (PARTITION BY category ORDER BY comments) rank1 > FROM t > GROUP BY category > GROUPING SETS ((), (category)) > HAVING max(comments) > 0; > > return the following results: > > NULL 1 48 1 > 2 1 49 1 > 3 1 49 1 > 8 1 49 1 > > When using grouping sets with the rank() function the max() function return > incorrect results. Everything works fine if I remove grouping sets clause and > split the query into two independent queries or remove the rank() function. > > This looks like a bug to me but please review. That said, I'm not sure if > it's just Amazon issue or general Hive issue. > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 4632f08 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java > 90b4b12 > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java > afd1738 > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cea86df > ql/src/test/queries/clientpositive/groupby_grouping_window.q PRE-CREATION > ql/src/test/results/clientpositive/annotate_stats_groupby.q.out 89dd1de > ql/src/test/results/clientpositive/annotate_stats_groupby2.q.out c3bf0d8 > ql/src/test/results/clientpositive/groupby_cube1.q.out 4f44d4f > ql/src/test/results/clientpositive/groupby_grouping_sets2.q.out 0a21dbe > ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out 3597609 > ql/src/test/results/clientpositive/groupby_grouping_sets4.q.out d1be46d > ql/src/test/results/clientpositive/groupby_grouping_sets5.q.out 6d11add > ql/src/test/results/clientpositive/groupby_grouping_sets6.q.out d2ff112 > ql/src/test/results/clientpositive/groupby_rollup1.q.out 0108ce0 > > ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out > 301b90c > > Diff: https://reviews.apache.org/r/29878/diff/ > > > Testing > ------- > > > Thanks, > > Navis Ryu > >