[ https://issues.apache.org/jira/browse/HIVE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176506#comment-14176506 ]
Gunther Hagleitner commented on HIVE-8498:
------------------------------------------

Tagging, afaik, only comes into play for demux/mux. It might be easier to fix the multi-insert case, especially since I know the event broadcast is already working (and you would disable this).

The plan for this multi-insert query should be something like:

ts -> fil[1] -> fs[1]
   -> fil[2] -> fs[2]
   -> fil[3] -> fs[3]

The problem might be as simple as making sure the TS forwards to all its children. It might, however, also be a case of the vectorization code not converting operators correctly. If it's simple, the best approach might be to put in a fix for the multi-insert case and disable the correlation optimizer (tagging) when vectorization is on.

[~jnp] do you have any insights?

> Insert into table misses some rows when vectorization is enabled
> ----------------------------------------------------------------
>
>                 Key: HIVE-8498
>                 URL: https://issues.apache.org/jira/browse/HIVE-8498
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 0.14.0, 0.13.1
>            Reporter: Prasanth J
>            Assignee: Matt McCline
>            Priority: Critical
>              Labels: vectorization
>         Attachments: HIVE-8498.01.patch, HIVE-8498.02.patch
>
>
> Following is a small reproducible case for the issue
> create table orc1
> stored as orc
> tblproperties("orc.compress"="ZLIB")
> as
> select rn
> from
> (
> select cast(1 as int) as rn from src limit 1
> union all
> select cast(100 as int) as rn from src limit 1
> union all
> select cast(10000 as int) as rn from src limit 1
> ) t;
> create table orc_rn1 (rn int);
> create table orc_rn2 (rn int);
> create table orc_rn3 (rn int);
> // These inserts should produce 3 rows but only 1 row is produced
> from orc1 a
> insert overwrite table orc_rn1 select a.* where a.rn < 100
> insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
> insert overwrite table orc_rn3 select a.* where a.rn >= 1000;
> select * from orc_rn1
> union all
> select * from orc_rn2
> union all
> select * from orc_rn3;
> The expected output of the query is
> 1
> 100
> 10000
> But with vectorization enabled we get
> 1

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
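
To make the forwarding hypothesis from the comment concrete, here is a toy Python model of the multi-insert plan. It is not Hive code: run_multi_insert, forward_to_all, and union_all are made-up names for illustration. It uses the rows and predicates from the repro case and assumes, purely to illustrate the hypothesis, that the faulty behaviour is a table scan feeding only its first child. Under that assumption it yields the reported output, which is consistent with the hypothesis but does not confirm it.

```python
# Toy model of the multi-insert plan; this is NOT Hive code.
# Rows and predicates are taken from the repro case in the issue.

ROWS = [1, 100, 10000]  # contents of orc1.rn

# One (filter -> sink) branch per INSERT clause: fil[i] -> fs[i]
BRANCHES = {
    "orc_rn1": lambda rn: rn < 100,
    "orc_rn2": lambda rn: 100 <= rn < 1000,
    "orc_rn3": lambda rn: rn >= 1000,
}


def run_multi_insert(rows, branches, forward_to_all=True):
    """Push each scanned row to the child branches and collect each sink's rows.

    forward_to_all=False mimics a hypothetical scan that feeds only its
    first child (an assumption made here for illustration).
    """
    sinks = {name: [] for name in branches}
    children = list(branches.items())
    if not forward_to_all:
        children = children[:1]
    for rn in rows:
        for name, predicate in children:
            if predicate(rn):
                sinks[name].append(rn)
    return sinks


def union_all(sinks):
    """Concatenate the sink tables in declaration order."""
    return [rn for name in sinks for rn in sinks[name]]


expected = union_all(run_multi_insert(ROWS, BRANCHES))
broken = union_all(run_multi_insert(ROWS, BRANCHES, forward_to_all=False))
print(expected)  # [1, 100, 10000]
print(broken)    # [1]
```

If the real cause is instead in how vectorization converts the operators, the same symptom could appear with correct forwarding, so this sketch only shows that the first hypothesis is compatible with the observed result.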