Rajkumar Singh created HIVE-20673:
-------------------------------------

             Summary: vectorized map join fail with Unexpected column vector type STRUCT.
                 Key: HIVE-20673
                 URL: https://issues.apache.org/jira/browse/HIVE-20673
             Project: Hive
          Issue Type: Bug
          Components: Hive, Transactions, Vectorization
    Affects Versions: 3.1.0
         Environment: hive-3, java-8
            Reporter: Rajkumar Singh


The following UPDATE query on an ACID table fails with the exception shown below.

UPDATE census_clus SET name = 'updated name' where ssn=100 and EXISTS (select distinct ssn from census where ssn=census_clus.ssn);

{code}
Caused by: java.lang.RuntimeException: Map operator initialization failed
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:354)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
        ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected column vector type STRUCT
        at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:302)
        at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:419)
        at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.initializeOp(VectorMapJoinGenerateResultOperator.java:115)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:572)
        at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:524)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:335)
{code}

STEPS TO REPRODUCE
{code}
create table census(
ssn int,
name string,
city string,
email string)
row format delimited
fields terminated by ',';

insert into census values(100,"raj","san jose","email");

create table census_clus(
ssn int,
name string,
city string,
email string)
clustered by (ssn) into 4 buckets
stored as orc
TBLPROPERTIES ('transactional'='true');

insert into table census_clus select * from census;

UPDATE census_clus SET name = 'updated name' where ssn=100 and EXISTS (select distinct ssn from census where ssn=census_clus.ssn);
{code}
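
One way to check whether the failure is confined to the vectorized map-join path is to rerun the statement with those features switched off for the session. This is only a diagnostic sketch, not a verified workaround for this issue: hive.vectorized.execution.enabled and hive.auto.convert.join are standard Hive settings, but whether either one avoids the exception here is an assumption.

{code}
-- Assumption: run in the same session, against the tables created above.
-- Disable vectorized execution so the plan falls back to row-mode operators.
set hive.vectorized.execution.enabled=false;
UPDATE census_clus SET name = 'updated name' where ssn=100 and EXISTS (select distinct ssn from census where ssn=census_clus.ssn);

-- Alternatively, keep vectorization on but stop the join from being
-- converted to a map join, to see whether only the map-join path is affected.
set hive.vectorized.execution.enabled=true;
set hive.auto.convert.join=false;
UPDATE census_clus SET name = 'updated name' where ssn=100 and EXISTS (select distinct ssn from census where ssn=census_clus.ssn);
{code}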

Looking at the exception, it seems the join operator is getting incorrect typeInfo while performing the join; _col6 appears to be of struct type.

{code}
2018-10-02 22:22:23,392 [INFO] [TezChild] |exec.CommonJoinOperator|: JOIN struct<_col2:string,_col3:string,_col6:struct<writeid:bigint,bucketid:int,rowid:bigint>> totalsz = 3

{code}
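
The nested struct<writeid:bigint,bucketid:int,rowid:bigint> has the same shape as the ROW__ID virtual column that Hive 3 exposes on transactional tables, so _col6 may simply be ROW__ID being carried through the join for the UPDATE. That reading is an assumption, not something confirmed in this report. A sketch for checking it, using the tables from the reproduction steps:

{code}
-- Show the ROW__ID virtual column of the ACID table; its type can be
-- compared with the struct logged for _col6.
select ROW__ID, ssn, name from census_clus where ssn=100;

-- Print the plan with vectorization details to see which columns
-- the vectorized map join operator is asked to handle.
EXPLAIN VECTORIZATION DETAIL
UPDATE census_clus SET name = 'updated name' where ssn=100 and EXISTS (select distinct ssn from census where ssn=census_clus.ssn);
{code}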



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
