[ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816767#comment-15816767 ]
Gopal V commented on HIVE-15573:
--------------------------------

The timings for Map 1 went from 85s -> 13s, when vectorization (incorrectly bucketed) was applied to this ReduceSink.

> Vectorization: ACID shuffle ReduceSink is not specialized
> ----------------------------------------------------------
>
>                 Key: HIVE-15573
>                 URL: https://issues.apache.org/jira/browse/HIVE-15573
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions, Vectorization
>    Affects Versions: 2.2.0
>            Reporter: Gopal V
>         Attachments: screenshot-1.png
>
>
> The ACID shuffle disabled murmur hash for the shuffle, due to the bucketing
> requirements demanding the writable hashcode for the shuffles.
> {code}
> boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM);
> if (!useUniformHash) {
>   return false;
> }
> {code}
> This check protects the fast ReduceSink ops from being used in ACID inserts.
> A specialized case for the following pattern will make ACID insert much
> faster.
> {code}
> Reduce Output Operator
>   sort order:
>   Map-reduce partition columns: _col0 (type: bigint)
>   value expressions: ....
> {code}
> !screenshot-1.png!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
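
To illustrate why applying the fast (murmur-hashing) ReduceSink to an ACID insert produces incorrectly bucketed output, here is a minimal, self-contained sketch. It is not Hive code: the class and method names are hypothetical, `Long.hashCode` is only a stand-in for the writable hashcode, and the Murmur3 64-bit finalizer is only a stand-in for the uniform hash. The point is that a row must reach the reducer that owns its bucket, so both sides have to agree on the hash function.

```java
public class BucketingSketch {

    // Stand-in for the writable-style hashcode of a bigint key.
    // Assumption for illustration only; not taken from Hive.
    public static int writableStyleHash(long key) {
        return Long.hashCode(key);
    }

    // Stand-in for a uniform hash: the Murmur3 64-bit finalizer (fmix64),
    // folded to 32 bits. Assumption for illustration only; not Hive's murmur code.
    public static int murmurStyleHash(long key) {
        long k = key;
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return (int) (k ^ (k >>> 32));
    }

    // Map a hash to a bucket (or reducer) index in [0, numBuckets).
    public static int bucketFor(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    // Count keys in [0, maxKeyExclusive) that the two hash functions
    // would send to different buckets.
    public static int countMismatches(long maxKeyExclusive, int numBuckets) {
        int mismatches = 0;
        for (long key = 0; key < maxKeyExclusive; key++) {
            int expected = bucketFor(writableStyleHash(key), numBuckets);
            int actual = bucketFor(murmurStyleHash(key), numBuckets);
            if (expected != actual) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        System.out.println("keys routed to a different bucket: "
                + countMismatches(1000, numBuckets) + " of 1000");
    }
}
```

Under these assumptions, any nonzero mismatch count means rows land in the wrong bucket file. That is consistent with the `useUniformHash` guard keeping the fast ReduceSink out of ACID inserts until a specialized variant hashes the partition column in the bucketing-compatible way.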