[ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176491#comment-14176491
 ] 

Matt McCline commented on HIVE-8498:
------------------------------------

Jitendra [~jnp] told me a while ago that the vectorization logic doesn't 
support, and wasn't architected for, tagging or multiple children.  Part of 
this may have to do with the shadow VectorizationContext data structures that 
track the columns of the vectorized row batches for each vectorized operator.

This JIRA is about basic multi-insert query functionality not working: only 
rows from the first insert are processed.  I don't know whether the solution 
is difficult or not.
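
Until that is resolved, a possible stopgap is sketched below. It is an 
assumption, not something confirmed in this thread: since the report says the 
rows go missing only with vectorization enabled, turning off the standard 
hive.vectorized.execution.enabled setting for the session should keep the 
multi-insert off the vectorized code path.

```sql
-- Assumed stopgap (not verified against this issue): disable vectorized
-- execution for the current session before running the multi-insert.
set hive.vectorized.execution.enabled=false;

-- Multi-insert from the issue description, unchanged.
from orc1 a
insert overwrite table orc_rn1 select a.* where a.rn < 100
insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
```

This trades vectorized performance for correctness, so it only makes sense as 
a temporary measure for the affected queries.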


> Insert into table misses some rows when vectorization is enabled
> ----------------------------------------------------------------
>
>                 Key: HIVE-8498
>                 URL: https://issues.apache.org/jira/browse/HIVE-8498
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 0.14.0, 0.13.1
>            Reporter: Prasanth J
>            Assignee: Matt McCline
>            Priority: Critical
>              Labels: vectorization
>         Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch
>
>
>  The following is a small reproducible case for the issue:
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
>     select rn
>     from
>     (
>       select cast(1 as int) as rn from src limit 1
>       union all
>       select cast(100 as int) as rn from src limit 1
>       union all
>       select cast(10000 as int) as rn from src limit 1
>     ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> -- These inserts should produce 3 rows, but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 10000
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
