[ 
https://issues.apache.org/jira/browse/HIVE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091852#comment-13091852
 ] 

jirapos...@reviews.apache.org commented on HIVE-2374:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1516/#review1658
-----------------------------------------------------------


Still looking at the tests. Here are my partial comments.


trunk/conf/hive-default.xml
<https://reviews.apache.org/r/1516/#comment3701>

    There is another parameter, hive.exec.compress.intermediate, which controls 
whether to compress the intermediate data between MR jobs. Can you check whether 
it's turned on by default? I think we should turn that on along with this change 
so that the unit tests actually cover your new code path. 



trunk/conf/hive-default.xml
<https://reviews.apache.org/r/1516/#comment3700>

    Please be more specific (an example would help) about when this codec is 
going to be used. 
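
    As a rough illustration of the kind of example being asked for, the 
behavior described in the quoted summary below could be sketched as follows. 
This is only a sketch under stated assumptions, not the code in the patch: the 
class, the method names, the configuration key, and the "lzo" placeholder are 
all hypothetical, and the sketch does not model whether 
hive.exec.compress.intermediate gates the behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names are taken from the patch.
class IntermediateCompressionSketch {

    // Assumed key for the new codec setting; the real name is defined in the patch.
    static final String INTERMEDIATE_CODEC_KEY = "sketch.intermediate.compression.codec";
    // Placeholder value; the summary only says that the default is LZO.
    static final String DEFAULT_INTERMEDIATE_CODEC = "lzo";

    /** Stand-in for MapredWork / MapredLocalWork carrying the new flag. */
    static class Work {
        // Per the summary, the flag defaults to true ...
        private boolean intermediate = true;

        // ... and is cleared when a task that moves data into a table or file is generated.
        void markFeedsMoveTask() {
            intermediate = false;
        }

        boolean isIntermediate() {
            return intermediate;
        }
    }

    /**
     * Returns the codec to apply to the reduce output of the given work,
     * or null when the work is not intermediate and the new setting is not consulted.
     */
    static String codecFor(Work work, Map<String, String> conf) {
        if (!work.isIntermediate()) {
            return null;
        }
        return conf.getOrDefault(INTERMEDIATE_CODEC_KEY, DEFAULT_INTERMEDIATE_CODEC);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        Work first = new Work(); // e.g. the first job of a multi-job insert
        Work last = new Work();  // the job whose output is moved into the table
        last.markFeedsMoveTask();
        System.out.println("first job codec: " + codecFor(first, conf));
        System.out.println("last job codec: " + codecFor(last, conf));
    }
}
```

    In this reading, only jobs whose output feeds another map reduce job 
would pick up the configured codec; the job that writes the final table or 
file would not.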


- Ning


On 2011-08-16 00:24:28, Kevin Wilfong wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1516/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-16 00:24:28)
bq.  
bq.  
bq.  Review request for hive and Ning Zhang.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  I added a field to MapredWork and MapredLocalWork which indicates whether 
the work is intermediate or not.  By intermediate, I mean that if the query is an 
insert, there is at least one other map reduce task that is guaranteed to 
happen before the move.  If the query is not an insert, every map reduce task is 
intermediate.  I determine this by defaulting the flag to true, and setting it to 
false when the tasks to move the data into a table or file are generated.
bq.  
bq.  If the work for a map reduce task (local or otherwise) is intermediate, 
then we set the compression to be used on the output of the reduce to some 
configured value; the default is LZO.
bq.  
bq.  
bq.  This addresses bug HIVE-2374.
bq.      https://issues.apache.org/jira/browse/HIVE-2374
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1157918 
bq.    trunk/conf/hive-default.xml 1157918 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1157918 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 1157918 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 1157918 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 1157918 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1157918 
bq.    trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 1157918 
bq.    trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsIntermediateHook.java PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientpositive/intermediate_compression.q PRE-CREATION 
bq.    trunk/ql/src/test/results/clientpositive/intermediate_compression.q.out PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/1516/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  I added a test query and hook to verify that the is-intermediate flag is 
set properly in the MapredWork/MapredLocalWork.
bq.  
bq.  I also added a test to TestExecDriver which checks that the correct 
compression is used on the output of the reduce for each value of the 
is-intermediate flag.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kevin
bq.  
bq.



> Make compression used between map reduce tasks configurable.
> ------------------------------------------------------------
>
>                 Key: HIVE-2374
>                 URL: https://issues.apache.org/jira/browse/HIVE-2374
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2374.1.patch.txt
>
>
> We want to allow the compression between map reduce tasks to be configurable, 
> similar to the way compression between the map and reduce phases is configurable.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
