[ https://issues.apache.org/jira/browse/HIVE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085454#comment-13085454 ]
Kevin Wilfong commented on HIVE-2374:
-------------------------------------

We ran a test query that joins a ~30 MB table with itself on a primary key and then does a group by on a randomly generated value. Relative to the original query (compression not configurable), we saw a 27% decrease in the runtime of the join map reduce task, with a 17% increase in the size of its output, and a 5% decrease in the runtime of the group by map reduce task, with a 264% increase in the size of its output.

For the original query, the join map reduce task took 72.8 sec with 12.8 MB of output, and the group by map reduce task took 28.4 sec with 14 KB of output.

> Make compression used between map reduce tasks configurable.
> ------------------------------------------------------------
>
>                 Key: HIVE-2374
>                 URL: https://issues.apache.org/jira/browse/HIVE-2374
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2374.1.patch.txt
>
>
> We want to allow the compression between map reduce tasks to be configurable,
> similar to the way it is configurable between the map and reduce jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
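
The comment gives baseline figures and percentage changes but not the resulting absolute numbers. The sketch below derives them with simple arithmetic. It assumes each percentage is relative to the corresponding baseline value quoted in the comment; the names `BASELINE`, `CHANGES`, and `implied` are hypothetical and are not part of Hive or the patch.

```python
# Baseline figures quoted in the comment (original query,
# compression not configurable). Units are kept as reported.
BASELINE = {
    "join": {"runtime_sec": 72.8, "output_mb": 12.8},
    "group_by": {"runtime_sec": 28.4, "output_kb": 14.0},
}

# Reported relative changes. ASSUMPTION: each percentage is taken
# relative to the matching baseline value above.
CHANGES = {
    "join": {"runtime_sec": -0.27, "output_mb": +0.17},
    "group_by": {"runtime_sec": -0.05, "output_kb": +2.64},
}


def implied(baseline_value: float, relative_change: float) -> float:
    """Return the value implied by applying a relative change to a baseline."""
    return baseline_value * (1.0 + relative_change)


def implied_figures() -> dict:
    """Apply every reported change to its baseline figure."""
    return {
        task: {
            metric: implied(value, CHANGES[task][metric])
            for metric, value in metrics.items()
        }
        for task, metrics in BASELINE.items()
    }


if __name__ == "__main__":
    for task, metrics in implied_figures().items():
        for metric, value in metrics.items():
            print(f"{task:9s} {metric:12s} {value:8.2f}")
```

Under that assumption, the join task comes out at roughly 53.1 sec with about 15.0 MB of output, and the group by task at roughly 27.0 sec with about 51 KB of output. These are derived values, not measurements reported in the comment.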