-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1516/#review1658
-----------------------------------------------------------

Still looking at the tests. Here are my partial comments.


trunk/conf/hive-default.xml
<https://reviews.apache.org/r/1516/#comment3701>

    There is another parameter, hive.exec.compress.intermediate, which controls whether to compress the intermediate data between MR jobs. Can you check if it's turned on by default? I think we should turn that on and make this change so that the unit tests actually cover your new code path.


trunk/conf/hive-default.xml
<https://reviews.apache.org/r/1516/#comment3700>

    Please be more specific (an example would help) about when this codec is going to be used.


- Ning


On 2011-08-16 00:24:28, Kevin Wilfong wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/1516/
> -----------------------------------------------------------
>
> (Updated 2011-08-16 00:24:28)
>
>
> Review request for hive and Ning Zhang.
>
>
> Summary
> -------
>
> I added a field to MapredWork and MapredLocalWork which indicates whether it
> is intermediate or not. By intermediate, I mean that if the query is an
> insert, there is at least one other map reduce task that is guaranteed to
> happen before the move. If the query is not an insert, intermediate applies
> to them all. I determine this by defaulting the flag to true, and setting it
> to false when the tasks to move the data into a table or file are generated.
>
> If the work for a map reduce task (local or otherwise) is intermediate, then
> we set the compression to be used on the output of the reduce to some
> configured value, the default is LZO.
>
>
> This addresses bug HIVE-2374.
>     https://issues.apache.org/jira/browse/HIVE-2374
>
>
> Diffs
> -----
>
>   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1157918
>   trunk/conf/hive-default.xml 1157918
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1157918
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 1157918
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 1157918
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 1157918
>   trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1157918
>   trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 1157918
>   trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsIntermediateHook.java PRE-CREATION
>   trunk/ql/src/test/queries/clientpositive/intermediate_compression.q PRE-CREATION
>   trunk/ql/src/test/results/clientpositive/intermediate_compression.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/1516/diff
>
>
> Testing
> -------
>
> I added a test query and hook to verify that the is intermediate flag is set
> properly in the MapredWork/MapredLocalWork.
>
> I also added a test to TestExecDriver which checks that the correct
> compression is used on the output of the reduce for each value of the is
> intermediate flag.
>
>
> Thanks,
>
> Kevin
>
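
For reference, the behaviour described in the quoted summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: Work is a stand-in for MapredWork/MapredLocalWork, the config key CODEC_KEY is hypothetical, and the string "LZO" is a placeholder for whatever codec class the default actually names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class IntermediateCompressionSketch {

    // Hypothetical config key, used only for this illustration.
    static final String CODEC_KEY = "example.intermediate.compression.codec";
    // Placeholder for the default codec; the summary says the default is LZO.
    static final String DEFAULT_CODEC = "LZO";

    /** Stand-in for MapredWork / MapredLocalWork; not the real Hive classes. */
    static final class Work {
        // The flag defaults to true, as described in the summary.
        private boolean intermediate = true;

        boolean isIntermediate() {
            return intermediate;
        }

        // Called when a task that moves data into a table or file is generated.
        void markFinal() {
            intermediate = false;
        }
    }

    /**
     * Returns the codec to use on the reduce output of an intermediate job,
     * or empty when the work is not intermediate and nothing is overridden.
     */
    static Optional<String> reduceOutputCodec(Work work, Map<String, String> conf) {
        if (!work.isIntermediate()) {
            return Optional.empty();
        }
        return Optional.of(conf.getOrDefault(CODEC_KEY, DEFAULT_CODEC));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        Work firstStage = new Work();   // feeds another map reduce task
        Work lastStage = new Work();    // followed by the move into the table
        lastStage.markFinal();

        System.out.println("first stage codec: " + reduceOutputCodec(firstStage, conf));
        System.out.println("last stage codec:  " + reduceOutputCodec(lastStage, conf));
    }
}
```

In this sketch only the stage that is followed by the move is left alone; every earlier stage gets the configured codec, falling back to the placeholder default when the key is unset.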