[ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
------------------------------
    Attachment: HIVE-12631.11.patch

This 11th patch has two major changes. The first is the new ORC ACID row 
batch encoded data consumer. It adds the vectorized ORC ACID row batch reader 
to LLAP, which performs well for LLAP ACID reads. The second is the reader 
generalization in the ORC raw record merger: the ACID logic can now work with 
other readers, not only the ORC reader.
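
To illustrate the idea behind the reader generalization, here is a minimal, 
self-contained sketch. It is not code from the patch: RowSource, fromArray and 
merge are hypothetical names, and rows are reduced to a single ascending long 
row ID, whereas real ACID row identifiers carry more fields. The point is only 
that the merge logic depends on a small interface rather than on one concrete 
reader.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Hive code base. */
final class AcidMergeSketch {

    /** Anything that can hand out row IDs in ascending order. */
    interface RowSource {
        long END = -1L;           // sentinel: no more rows (row IDs are assumed >= 0)
        long next();
    }

    /** Stand-in for a concrete reader, backed by a sorted array. */
    static RowSource fromArray(long[] sortedRowIds) {
        return new RowSource() {
            private int pos = 0;
            @Override public long next() {
                return pos < sortedRowIds.length ? sortedRowIds[pos++] : END;
            }
        };
    }

    /** Returns the base rows that are not listed in the delete source. */
    static List<Long> merge(RowSource base, RowSource deletes) {
        List<Long> out = new ArrayList<>();
        long deleted = deletes.next();
        for (long row = base.next(); row != RowSource.END; row = base.next()) {
            // Advance the delete cursor until it reaches or passes the current row.
            while (deleted != RowSource.END && deleted < row) {
                deleted = deletes.next();
            }
            if (deleted != row) {
                out.add(row);
            }
        }
        return out;
    }
}
```

Because merge only sees RowSource, a cache-backed source could be passed in 
place of the array-backed one without changing the merge logic.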

This patch enables the following work in other issues:
# Introducing the LLAP record reader in the ORC raw record merger to minimize 
non-LLAP reads
# Replacing BitSet objects with integer arrays for more performance
# Adding the vectorized ORC ACID row reader in LLAP.
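
As a rough illustration of the second item, the sketch below converts a BitSet 
of selected row positions into a compact integer array of indices. This is an 
assumption about what the BitSet objects represent, not something taken from 
the patch, and SelectionSketch and toSelectedArray are hypothetical names. An 
index array can be walked with a plain loop, without a nextSetBit scan per row.

```java
import java.util.BitSet;

/** Hypothetical sketch; not taken from the patch. */
final class SelectionSketch {

    /** Returns the positions of the set bits, in ascending order. */
    static int[] toSelectedArray(BitSet selectedRows) {
        int[] selected = new int[selectedRows.cardinality()];
        int n = 0;
        // nextSetBit returns -1 once no further set bit exists.
        for (int i = selectedRows.nextSetBit(0); i >= 0; i = selectedRows.nextSetBit(i + 1)) {
            selected[n++] = i;
        }
        return selected;
    }
}
```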

> LLAP: support ORC ACID tables
> -----------------------------
>
>                 Key: HIVE-12631
>                 URL: https://issues.apache.org/jira/browse/HIVE-12631
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Teddy Choi
>         Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.1.patch, HIVE-12631.2.patch, 
> HIVE-12631.3.patch, HIVE-12631.4.patch, HIVE-12631.5.patch, 
> HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch, 
> HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember ACID logic is embedded inside ORC format; we need to 
> refactor it to be on top of some interface, if practical; or just port it to 
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache merged representation in future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
