[ https://issues.apache.org/jira/browse/HIVE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073604#comment-13073604 ]
jirapos...@reviews.apache.org commented on HIVE-1694:
-----------------------------------------------------

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java, line 61
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27052#file27052line61>
bq. >
bq. > Please run ant checkstyle and fix all the formatting discrepancies it reports for your new files.
bq. >

Done! Checkstyle still reports errors, but only in places where we have used LinkedHashMap, HashMap and ArrayList. The error states "Declaring variables, return values or parameters of type 'HashMap' is not allowed".

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java, line 184
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27052#file27052line184>
bq. >
bq. > Don't you need to reuse the compact implementation here so that the index can be used for WHERE (not just GROUP BY)?
bq. >

The AggregateIndexHandler now extends CompactIndexHandler instead of TableBasedIndexHandler. We override only the analyzeIndexDefinition(...) and getIndexBuilderMapRedTask(...) methods of CompactIndexHandler.

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 700
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27054#file27054line700>
bq. >
bq. > This method is redundant now.

Removed. Sorry to have missed that.

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java, line 252
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27056#file27056line252>
bq. >
bq. > I can't think of a case where it would be worse.

Ok.

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java, line 164
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27057#file27057line164>
bq. >
bq. > Actually group-by is now preserved in all cases.

I forgot to change a few comments after fixing the bug. Done!

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java, line 66
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27058#file27058line66>
bq. >
bq. > Please use HTML bullet syntax for Javadoc (otherwise it all gets run together into one line when rendered).
bq. >

Done!

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java, line 93
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27060#file27060line93>
bq. >
bq. > Shouldn't this be BIGINT?
bq. >
bq. > Also, I think you're supposed to use a TypeInfoFactory for this purpose.

Yes. Changed it to bigint. I also changed it in AggregateIndexHandler, where I had declared the type as "int".

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java, line 603
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27062#file27062line603>
bq. >
bq. > Not sure why this new constructor is needed...after using it, all you do is get the table out of it.

The only other constructor for tableSpec takes an ASTNode as one of its parameters. Since we need to construct a new tableSpec using only the index table name, and we do not have an ASTNode for it, I need this constructor. If you have another way in mind, please let me know. That would be helpful.

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q, line 27
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27063#file27063line27>
bq. >
bq. > This should *not* be using the index, since the index is built on count(l_shipdate), and l_shipdate may contain nulls, whereas the query is referencing count(1), which is insensitive to nulls.

Yes, true. :( I have now replaced the count(1) queries with count(l_shipdate) in the ql_rewrite_gbtoidx.q file. I also verified that the count(1) queries do not use the index.

bq. On 2011-07-28 21:40:30, John Sichi wrote:
bq. > ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q, line 61
bq. > <https://reviews.apache.org/r/1194/diff/1/?file=27063#file27063line61>
bq. >
bq. > Need additional tests to verify all the cases where the optimization should *not* be used:
bq. >
bq. > * when configuration disables it
bq. > * when index partitions do not cover table partitions (I still don't see the code for this case)
bq. > * ... all the other conditions checked for in the code ...
bq. >

Added new queries to verify that the optimization is not used in these cases:
* when configuration disables it
* ... all the other conditions checked for in the code ...

The case "when index partitions do not cover table partitions" is still pending (as is the code for it). I will upload the new patch once this is done.

- Prajakta

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1194/#review1212
-----------------------------------------------------------

On 2011-07-26 14:44:01, Prajakta Kalmegh wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1194/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-07-26 14:44:01)
bq.
bq. Review request for hive and John Sichi.
bq.
bq. Summary
bq. -------
bq.
bq. This patch has defined a new AggregateIndexHandler which is used to optimize the query plan for groupby queries.
bq.
bq. This addresses bug HIVE-1694.
bq. https://issues.apache.org/jira/browse/HIVE-1694
bq.
bq. Diffs
bq. -----
bq.
bq. common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f
bq. ql/src/java/org/apache/hadoop/hive/ql/index/AggregateIndexHandler.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndex.java 591c9ff
bq. ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2ca63b3
bq. ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 590d69a
bq. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteGBUsingIndex.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 77a6dc6
bq. ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx.q PRE-CREATION
bq. ql/src/test/results/clientpositive/ql_rewrite_gbtoidx.q.out PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/1194/diff
bq.
bq. Testing
bq. -------
bq.
bq. Thanks,
bq.
bq. Prajakta
bq.
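
On the checkstyle point in the reply above: the quoted message ("Declaring variables, return values or parameters of type 'HashMap' is not allowed") reads like the output of a Checkstyle rule that forbids concrete collection types in declarations. The usual way to satisfy such a rule is to declare the interface type (Map, List) and keep the concrete class only at the point of construction. The sketch below is a hypothetical helper, not code from the patch; the class and method names are made up for illustration, and it only applies where the surrounding Hive signatures accept the interface type.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InterfaceTypedDeclarations {

  // Hypothetical helper (not from the patch): groups {table, column} pairs by table.
  // Parameter, return value and local variables are all declared with interface
  // types; LinkedHashMap and ArrayList appear only in the "new" expressions.
  static Map<String, List<String>> groupColumnsByTable(List<String[]> pairs) {
    Map<String, List<String>> byTable = new LinkedHashMap<String, List<String>>();
    for (String[] pair : pairs) {
      List<String> columns = byTable.get(pair[0]);
      if (columns == null) {
        columns = new ArrayList<String>();
        byTable.put(pair[0], columns);
      }
      columns.add(pair[1]);
    }
    return byTable;
  }

  public static void main(String[] args) {
    List<String[]> pairs = new ArrayList<String[]>();
    pairs.add(new String[] {"lineitem", "l_shipdate"});
    pairs.add(new String[] {"lineitem", "l_orderkey"});
    System.out.println(groupColumnsByTable(pairs).keySet());
  }
}
```

Where an existing method signature requires the concrete class, this change is not possible without touching that signature as well, which may explain why the remaining errors are limited to those places.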
> Accelerate GROUP BY execution using indexes
> -------------------------------------------
>
> Key: HIVE-1694
> URL: https://issues.apache.org/jira/browse/HIVE-1694
> Project: Hive
> Issue Type: New Feature
> Components: Indexing, Query Processor
> Affects Versions: 0.7.0
> Reporter: Nikhil Deshpande
> Assignee: Prajakta Kalmegh
> Attachments: HIVE-1694.1.patch.txt, HIVE-1694.2.patch.txt,
> HIVE-1694.3.patch.txt, HIVE-1694.4.patch, HIVE-1694_2010-10-28.diff,
> demo_q1.hql, demo_q2.hql
>
>
> The index building patch (Hive-417) is checked into trunk, this JIRA issue
> tracks supporting indexes in Hive compiler & execution engine for SELECT
> queries.
> This is in ref. to John's comment at
> https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869
> on creating separate JIRA issue for tracking index usage in optimizer & query
> execution.
> The aim of this effort is to use indexes to accelerate query execution (for
> certain class of queries). E.g.
> - Filters and range scans (already being worked on by He Yongqiang as part of
> HIVE-417?)
> - Joins (index based joins)
> - Group By, Order By and other misc cases
> The proposal is multi-step:
> 1. Building index based operators, compiler and execution engine changes
> 2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose
> between index scans, full table scans etc.)
> This JIRA initially focuses on the first step. This JIRA is expected to hold
> the information about index based plans & operator implementations for above
> mentioned cases.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
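
To make the Group By case and the reviewer's NULL caveat concrete, here is a toy in-memory model. It is not Hive's implementation: the class and method names are hypothetical, and the "index" is just a map from each distinct l_shipdate value to count(l_shipdate). It shows why such an index can answer a GROUP BY query on count(l_shipdate) but not one on count(1) when the column contains NULLs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupByIndexSketch {

  // Toy "aggregate index": distinct column value -> count(column).
  // count(column) skips NULL values, so the NULL group stays at 0.
  static Map<String, Long> buildCountIndex(List<String> column) {
    Map<String, Long> index = new LinkedHashMap<String, Long>();
    for (String value : column) {
      Long soFar = index.get(value);
      long base = (soFar == null) ? 0L : soFar.longValue();
      index.put(value, value == null ? base : base + 1);
    }
    return index;
  }

  // Full-scan equivalent of SELECT col, count(1) ... GROUP BY col:
  // every row is counted, including rows where the column is NULL.
  static Map<String, Long> countRowsByGroup(List<String> column) {
    Map<String, Long> counts = new LinkedHashMap<String, Long>();
    for (String value : column) {
      Long soFar = counts.get(value);
      counts.put(value, (soFar == null ? 0L : soFar.longValue()) + 1);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Made-up sample data: two of the five rows have a NULL l_shipdate.
    List<String> shipDates =
        Arrays.asList("1996-01-01", "1996-01-01", null, "1997-05-05", null);
    System.out.println("count(l_shipdate) from index: " + buildCountIndex(shipDates));
    System.out.println("count(1) from full scan: " + countRowsByGroup(shipDates));
  }
}
```

For the non-NULL groups the two results agree, so reading the pre-aggregated counts is a valid replacement for the scan. For the NULL group the index holds 0 while count(1) is 2, which is why the count(1) queries must not be rewritten to use an index built on count(l_shipdate).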