[ 
https://issues.apache.org/jira/browse/HIVE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085454#comment-13085454
 ] 

Kevin Wilfong commented on HIVE-2374:
-------------------------------------

Running a test query that joins a ~30 MB table with itself on a primary key 
and then groups by a randomly generated value, we saw a 27% decrease in the 
runtime of the join map reduce task, with a 17% increase in the size of its 
output, and a 5% decrease in the runtime of the group by map reduce task, with 
a 264% increase in the size of its output.

Note that for the original query (compression not configurable), the time was 
72.8 sec for the join map reduce task with 12.8 MB of output, and 28.4 sec for 
the group by map reduce task with 14 KB of output.
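
As a rough cross-check, the absolute figures for the patched run can be 
estimated by applying the reported percentage changes to the baseline numbers. 
The sketch below is only arithmetic on the values quoted in this comment; the 
helper name apply_change is hypothetical, and because the percentages are 
rounded, the results are estimates rather than measured values.

```python
def apply_change(baseline, percent_change):
    """Return the baseline adjusted by a signed percentage change."""
    return baseline * (1 + percent_change / 100.0)


# Baseline figures quoted above (compression not configurable).
# Percentages are the rounded changes reported for the patched run.
join_runtime_sec = apply_change(72.8, -27)    # join task runtime
join_output_mb = apply_change(12.8, 17)       # join task output size
group_runtime_sec = apply_change(28.4, -5)    # group by task runtime
group_output_kb = apply_change(14.0, 264)     # group by task output size

print(f"join:     {join_runtime_sec:.1f} sec, {join_output_mb:.1f} MB")
print(f"group by: {group_runtime_sec:.1f} sec, {group_output_kb:.1f} KB")
```

Under these assumptions the join task comes out at roughly 53 sec with about 
15 MB of output, and the group by task at roughly 27 sec with about 51 KB of 
output.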

> Make compression used between map reduce tasks configurable.
> ------------------------------------------------------------
>
>                 Key: HIVE-2374
>                 URL: https://issues.apache.org/jira/browse/HIVE-2374
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-2374.1.patch.txt
>
>
> We want to allow the compression between map reduce tasks to be configurable, 
> similar to the way compression between the map and reduce jobs is configurable.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
