[ https://issues.apache.org/jira/browse/HIVE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638808#comment-13638808 ]
Phabricator commented on HIVE-4377:
-----------------------------------

navis has commented on the revision "HIVE-4377 [jira] Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:209 It was implemented as you suggested at first, but it was very confusing, with a lot of redundant code (there are seven possible cases sharing a common rule). But if you prefer, I'll update the patch.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:251 processOrderBy?
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:254 ProcessGroupBy?
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:264 Will be moved to JoinReducerProc.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:512 I was thinking of OperatorUtils or something. Methods like this will keep being added.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:685 Can be null if there is an operator like ScriptOperator between the two RSs.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:690 If there is a difference in key/partition/sort-order in the common part of the two RSs, they cannot be merged. I'll add a comment for that.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:395 I'll try.

REVISION DETAIL
  https://reviews.facebook.net/D10377

To: JIRA, navis
Cc: njain

> Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
> ------------------------------------------------------------------
>
>                 Key: HIVE-4377
>                 URL: https://issues.apache.org/jira/browse/HIVE-4377
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Gang Tim Liu
>            Assignee: Navis
>         Attachments: HIVE-4377.D10377.1.patch
>
>
> thanks a lot for addressing optimization in HIVE-2340. Awesome!
> Since we are developing at a very fast pace, it would be really useful to
> think about maintainability and testing of the large codebase. Highlights
> which are applicable for D1209:
>   1. Javadoc for all public/private functions, except for
> setters/getters. For any complex function, clear examples (input/output)
> would really help.
>   2. In particular, for query optimizations, it might be a good idea to have
> a simple working query at the top, and the expected changes. For example,
> the operator tree for that query at each step, or a detailed explanation
> at the top.
>   3. If possible, the test name (.q file) where the function is being
> invoked, or the query which would potentially test that scenario, if it
> is a query processor change.
>   4. Comments in each test (.q file) that include the jira
> number and what it is trying to test. Assumptions about each query.
>   5. Reduce the output for each test. Whenever a query outputs more
> than 10 results, there should be a reason; otherwise, each query result
> should be bounded by 10 rows.
> thanks a lot

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira