[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004882#comment-14004882 ]
Eric Hanson commented on HIVE-7105:
-----------------------------------

I agree with Remus. If you do want to get good performance with vectorization on the reduce side, you'll need to think carefully about how you can efficiently create full VectorizedRowBatches. Single-row or small VectorizedRowBatches will not give performance gains. Also, if it is expensive to load rows into the batches on the reduce side, that could dominate total runtime.

> Enable ReduceRecordProcessor to generate VectorizedRowBatches
> -------------------------------------------------------------
>
>                 Key: HIVE-7105
>                 URL: https://issues.apache.org/jira/browse/HIVE-7105
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Rajesh Balamohan
>            Assignee: Jitendra Nath Pandey
>         Attachments: HIVE-7105.1.patch
>
>
> Currently, ReduceRecordProcessor sends one key,value pair at a time to its
> operator pipeline. It would be beneficial to send VectorizedRowBatch to
> downstream operators.
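
To make the batching concern in the comment concrete, here is a minimal sketch of buffering single rows into full batches before handing them downstream. It deliberately does not use Hive's classes: `BatchConsumer`, `RowBatcher`, and `BatchingDemo` are hypothetical names invented for illustration, the single `long` column stands in for a real row layout, and the capacity of 1024 is an assumption chosen for the example. This is not the approach taken in HIVE-7105.1.patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical downstream operator interface; not part of Hive. */
interface BatchConsumer {
    // The first 'size' entries of 'column' are valid.
    // The array is reused by the caller after this method returns.
    void process(long[] column, int size);
}

/** Hypothetical helper that buffers single rows into fixed-capacity batches. */
final class RowBatcher {
    private final long[] column;
    private final BatchConsumer downstream;
    private int size = 0;

    RowBatcher(int capacity, BatchConsumer downstream) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.column = new long[capacity];
        this.downstream = downstream;
    }

    /** Appends one row; forwards the batch only once it is full. */
    void add(long value) {
        column[size++] = value;
        if (size == column.length) {
            flush();
        }
    }

    /** Forwards the final, possibly partial, batch at end of input. */
    void close() {
        if (size > 0) {
            flush();
        }
    }

    private void flush() {
        downstream.process(column, size);
        size = 0; // reuse the same array for the next batch
    }
}

class BatchingDemo {
    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        RowBatcher batcher = new RowBatcher(1024, (col, n) -> batchSizes.add(n));
        for (long i = 0; i < 2500; i++) {
            batcher.add(i);
        }
        batcher.close();
        System.out.println(batchSizes); // prints [1024, 1024, 452]
    }
}
```

Only the last batch is partial. Per-row cost is one array store and one comparison, and the buffer is reused across batches, which speaks to the point that loading rows into batches must itself stay cheap or it could dominate total runtime.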