[ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saurabh Seth updated HIVE-17917:
--------------------------------
    Attachment: HIVE-17917.patch
        Status: Patch Available  (was: Open)

Moved the computation of the offset and bucket so that it's done once per file 
when the splits are generated. The result is then passed along in the OrcSplit.

This is done only for vectorized execution mode because the non-vectorized 
readers handle this themselves. Perhaps that can be moved here as well.
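
To illustrate the idea, here is a minimal sketch of computing a per-file value 
once during split generation and attaching it to every split of that file. It 
is not the code in the attached patch: PerFileOffsetSketch, OffsetAndBucket, 
Split and generateSplits are hypothetical names, and the body of 
computeOffsetAndBucket is a placeholder rather than Hive's actual logic.

```java
import java.util.ArrayList;
import java.util.List;

public class PerFileOffsetSketch {

    /** Hypothetical stand-in for the (row offset, bucket) pair computed per file. */
    public static final class OffsetAndBucket {
        public final long rowIdOffset;
        public final int bucketId;

        public OffsetAndBucket(long rowIdOffset, int bucketId) {
            this.rowIdOffset = rowIdOffset;
            this.bucketId = bucketId;
        }
    }

    /** Hypothetical stand-in for a split that carries the precomputed result. */
    public static final class Split {
        public final String path;
        public final long start;
        public final long length;
        public final OffsetAndBucket offsetAndBucket;

        public Split(String path, long start, long length, OffsetAndBucket offsetAndBucket) {
            this.path = path;
            this.start = start;
            this.length = length;
            this.offsetAndBucket = offsetAndBucket;
        }
    }

    /** Counts invocations so the once-per-file behaviour can be observed. */
    public static int computeCalls = 0;

    /** Placeholder computation; the real one is assumed to be more expensive. */
    public static OffsetAndBucket computeOffsetAndBucket(String path) {
        computeCalls++;
        return new OffsetAndBucket(0L, Math.floorMod(path.hashCode(), 16));
    }

    /** Computes the result once for the file and shares it across all of its splits. */
    public static List<Split> generateSplits(String path, long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        OffsetAndBucket shared = computeOffsetAndBucket(path); // once per file
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            splits.add(new Split(path, start, length, shared));
        }
        return splits;
    }
}
```

In this sketch a reader would take the value from its split instead of 
recomputing it, so the cost no longer grows with the number of splits per file.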

> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> -------------------------------------------------------------------
>
>                 Key: HIVE-17917
>                 URL: https://issues.apache.org/jira/browse/HIVE-17917
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Saurabh Seth
>            Priority: Minor
>         Attachments: HIVE-17917.patch
>
>
> The VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() computation is 
> currently (after HIVE-17458) done once per split.  It could instead be done 
> once per file (since the result is the same for each split of the same file) 
> and passed along in OrcSplit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
