[jira] [Commented] (HIVE-18422) Vectorized input format should not be used when vectorized input format is excluded and row.serde is enabled

Matt McCline (JIRA) Mon, 22 Jan 2018 16:29:25 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-18422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335213#comment-16335213
 ]


Matt McCline commented on HIVE-18422:
-------------------------------------

I had forgotten we have 2 excludes variables, sorry (I should have remembered
since I reviewed the 2nd variable change!). FULL OUTER MapJoin has made my
mind mush. Ok, so I see what you are doing with this change and it makes sense.

hive.vectorized.use.vectorized.input.format
hive.vectorized.input.format.excludes

hive.vectorized.use.vector.serde.deserialize

hive.vectorized.use.row.serde.deserialize
hive.vectorized.row.serde.inputformat.excludes

+1 LGTM

> Vectorized input format should not be used when vectorized input format is 
> excluded and row.serde is enabled
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18422
>                 URL: https://issues.apache.org/jira/browse/HIVE-18422
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Minor
>         Attachments: HIVE-18422.01.patch, HIVE-18422.02.patch
>
>
> HIVE-17534 introduced a config that makes it possible to exclude certain
> input formats from vectorized execution without affecting other input formats.
> If an input format is excluded and row.serde is enabled at the same time, the
> vectorizer still sets {{useVectorizedInputFormat}} to true, which causes
> vectorized readers to be used in row.serde mode.
> In order to reproduce:
> {noformat}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.use.row.serde.deserialize=true;
> set hive.vectorized.use.vector.serde.deserialize=true;
> set hive.vectorized.execution.enabled=true;
> set hive.vectorized.execution.reduce.enabled=true;
> set hive.vectorized.row.serde.inputformat.excludes=;
> -- SORT_QUERY_RESULTS
> -- exclude MapredParquetInputFormat from vectorization, this should cause mapwork vectorization to be disabled
> set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
> set hive.vectorized.use.vectorized.input.format=true;
> create table orcTbl (t1 tinyint, t2 tinyint)
> stored as orc;
> insert into orcTbl values (54, 9), (-104, 25), (-112, 24);
> explain vectorization select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
> select t1, t2, (t1+t2) from orcTbl where (t1+t2) > 10;
> {noformat}
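
To make the interaction between the five settings concrete, here is a minimal sketch of the selection order this issue appears to be asking for. It is a simplified model written for illustration, not Hive's Vectorizer code: the class VectorizationConf, the function choose_read_mode, the flags has_vectorized_reader and vector_deserializable, and the returned mode strings are all hypothetical names, and the real checks may involve more conditions.

```python
from dataclasses import dataclass

PARQUET = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
ORC = "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"


@dataclass(frozen=True)
class VectorizationConf:
    """Hypothetical holder for the five settings named in this thread."""
    use_vectorized_input_format: bool          # hive.vectorized.use.vectorized.input.format
    input_format_excludes: frozenset           # hive.vectorized.input.format.excludes
    use_vector_serde_deserialize: bool         # hive.vectorized.use.vector.serde.deserialize
    use_row_serde_deserialize: bool            # hive.vectorized.use.row.serde.deserialize
    row_serde_inputformat_excludes: frozenset  # hive.vectorized.row.serde.inputformat.excludes


def choose_read_mode(conf, input_format, has_vectorized_reader,
                     vector_deserializable=False):
    """Pick a read mode for one input format (assumed ordering, not Hive's code)."""
    # A vectorized reader is only eligible if the format is NOT excluded.
    if (conf.use_vectorized_input_format
            and has_vectorized_reader
            and input_format not in conf.input_format_excludes):
        return "VECTORIZED_INPUT_FILE_FORMAT"
    # Assumption: vector serde only applies to formats that support it.
    if conf.use_vector_serde_deserialize and vector_deserializable:
        return "VECTOR_DESERIALIZE"
    # Row serde has its own, separate excludes list.
    if (conf.use_row_serde_deserialize
            and input_format not in conf.row_serde_inputformat_excludes):
        return "ROW_DESERIALIZE"
    return "NONE"


# Settings mirroring the repro script in the issue description.
repro_conf = VectorizationConf(
    use_vectorized_input_format=True,
    input_format_excludes=frozenset({PARQUET, ORC}),
    use_vector_serde_deserialize=True,
    use_row_serde_deserialize=True,
    row_serde_inputformat_excludes=frozenset(),
)

# ORC has a vectorized reader, but it is excluded, so the model falls
# through to row serde instead of using the vectorized input format.
print(choose_read_mode(repro_conf, ORC, has_vectorized_reader=True))
```

Under this model the excluded ORC table is read through row.serde, and the vectorized-input-format branch is never taken for it, which is the behavior the issue title describes.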



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
