[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches

Eric Hanson (JIRA) Wed, 21 May 2014 09:43:06 -0700

    [ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004882#comment-14004882 ]


Eric Hanson commented on HIVE-7105:
-----------------------------------

I agree with Remus. If you want good performance from vectorization on the 
reduce side, you'll need to think carefully about how to efficiently create 
full VectorizedRowBatches. Single-row or small VectorizedRowBatches will not 
give performance gains, because the fixed per-batch overhead is spread over 
too few rows. Also, if loading rows into the batches is expensive on the 
reduce side, that cost could dominate total runtime.
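
To make the batching concern concrete, here is a rough sketch of buffering 
reduce-side rows so that downstream operators only ever see full batches, plus 
one final partial batch. This is not the HIVE-7105 patch and does not use 
Hive's classes: LongBatch, ReduceSideBatcher and BatchingDemo are hypothetical 
names, the two long columns are a simplification, and the 1024-row capacity is 
an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical columnar batch; a stand-in, not Hive's VectorizedRowBatch. */
final class LongBatch {
    static final int CAPACITY = 1024; // assumed batch capacity
    final long[] keys = new long[CAPACITY];
    final long[] values = new long[CAPACITY];
    int size = 0; // number of valid rows currently in the batch
}

/** Buffers (key, value) rows and forwards them downstream one batch at a time. */
final class ReduceSideBatcher {
    private final Consumer<LongBatch> downstream;
    private final LongBatch batch = new LongBatch(); // reused to avoid per-batch allocation

    ReduceSideBatcher(Consumer<LongBatch> downstream) {
        this.downstream = downstream;
    }

    /** Copies one row into the column arrays; flushes only when the batch is full. */
    void add(long key, long value) {
        batch.keys[batch.size] = key;
        batch.values[batch.size] = value;
        batch.size++;
        if (batch.size == LongBatch.CAPACITY) {
            flush();
        }
    }

    /** Forwards any remaining rows as a final, possibly partial, batch. */
    void close() {
        if (batch.size > 0) {
            flush();
        }
    }

    private void flush() {
        downstream.accept(batch);
        batch.size = 0; // reset so the same arrays can be refilled
    }
}

final class BatchingDemo {
    /** Feeds `rows` rows through the batcher and records the size of each forwarded batch. */
    static List<Integer> batchSizes(int rows) {
        List<Integer> sizes = new ArrayList<>();
        ReduceSideBatcher batcher = new ReduceSideBatcher(b -> sizes.add(b.size));
        for (int i = 0; i < rows; i++) {
            batcher.add(i, 2L * i);
        }
        batcher.close();
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(batchSizes(2500));
    }
}
```

The per-row work here is two array stores and an increment. If the real 
reduce-side input needs deserialization or type conversion for every row, that 
cost is paid inside add() and is where the load expense mentioned above would 
show up.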

> Enable ReduceRecordProcessor to generate VectorizedRowBatches
> -------------------------------------------------------------
>
>                 Key: HIVE-7105
>                 URL: https://issues.apache.org/jira/browse/HIVE-7105
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Rajesh Balamohan
>            Assignee: Jitendra Nath Pandey
>         Attachments: HIVE-7105.1.patch
>
>
> Currently, ReduceRecordProcessor sends one key/value pair at a time to its 
> operator pipeline. It would be beneficial to send a VectorizedRowBatch to 
> downstream operators instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
