[ 
https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176506#comment-14176506
 ] 

Gunther Hagleitner commented on HIVE-8498:
------------------------------------------

Tagging, afaik, only comes into play for demux/mux. It might be easier to 
fix the multi insert case, especially since I know the event broadcast is 
already working (and you would disable this). The plan for this multi-insert 
query should be something like:

ts -> fil[1] -> fs[1]
    -> fil[2] -> fs[2]
    -> fil[3] -> fs[3] 

The problem might be as simple as making sure the TS forwards to all its 
children.
It might, however, also be a case of the vectorization code not converting 
operators correctly.
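
To make the forwarding requirement concrete, here is a toy model of the plan 
above. It is only a sketch: the class and function names are hypothetical and 
do not correspond to Hive's actual operator classes or to the vectorized code 
path. It just shows that every row leaving the TS has to reach all three 
fil -> fs branches.

```python
# Toy model of the plan: one table scan (ts) feeding three
# filter -> file sink branches. All names are hypothetical and
# are NOT Hive's operator implementation.

class FileSink:
    """Collects the rows that reach it (stands in for fs[n])."""

    def __init__(self):
        self.rows = []

    def process(self, row):
        self.rows.append(row)


class Filter:
    """Passes a row to its child only if the predicate holds (fil[n])."""

    def __init__(self, predicate, child):
        self.predicate = predicate
        self.child = child

    def process(self, row):
        if self.predicate(row):
            self.child.process(row)


class TableScan:
    """Forwards every row to every child (ts)."""

    def __init__(self, children):
        self.children = children

    def forward(self, row):
        # Each child must see each row; forwarding to only the first
        # child would leave the other sinks empty.
        for child in self.children:
            child.process(row)


def run_multi_insert(rows):
    """Run the three-branch plan and return the rows of each sink."""
    fs1, fs2, fs3 = FileSink(), FileSink(), FileSink()
    ts = TableScan([
        Filter(lambda rn: rn < 100, fs1),
        Filter(lambda rn: 100 <= rn < 1000, fs2),
        Filter(lambda rn: rn >= 1000, fs3),
    ])
    for row in rows:
        ts.forward(row)
    return fs1.rows, fs2.rows, fs3.rows
```

With the three rows from the repro (1, 100, 10000), each sink in this model 
ends up with exactly one row. A scan that only reached its first child would 
fill just the first sink, which would be consistent with the single row 
reported in the issue.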

If it's simple, the best approach might be to put in a fix for the multi-insert 
case, and disable the correlation optimizer (tagging) when vectorization is on.

[~jnp] do you have any insights?

> Insert into table misses some rows when vectorization is enabled
> ----------------------------------------------------------------
>
>                 Key: HIVE-8498
>                 URL: https://issues.apache.org/jira/browse/HIVE-8498
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 0.14.0, 0.13.1
>            Reporter: Prasanth J
>            Assignee: Matt McCline
>            Priority: Critical
>              Labels: vectorization
>         Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch
>
>
>  The following is a small reproducible case for the issue:
> create table orc1
>   stored as orc
>   tblproperties("orc.compress"="ZLIB")
>   as
>     select rn
>     from
>     (
>       select cast(1 as int) as rn from src limit 1
>       union all
>       select cast(100 as int) as rn from src limit 1
>       union all
>       select cast(10000 as int) as rn from src limit 1
>     ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> -- These inserts should produce 3 rows, but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 10000
> But with vectorization enabled we get
> 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
