[ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049858#comment-15049858
 ] 

Matt McCline commented on HIVE-12435:
-------------------------------------

The vectorized row batch output from the VectorSelectOperator is:

{code}
./target/tmp/log/hive.log:2015-12-09T17:58:25,452 INFO  [LocalJobRunner Map Task Executor #0[]]: vector.VectorizedBatchUtil (VectorizedBatchUtil.java:debugDisplayOneRow(694)) - VectorSelectOperator exit... row 0 (col 0) bytes: 'key1' (col 1) 1
./target/tmp/log/hive.log:2015-12-09T17:58:25,452 INFO  [LocalJobRunner Map Task Executor #0[]]: vector.VectorizedBatchUtil (VectorizedBatchUtil.java:debugDisplayOneRow(694)) - VectorSelectOperator exit... row 1 (col 0) bytes: 'key2' (col 1) 0
./target/tmp/log/hive.log:2015-12-09T17:58:25,452 INFO  [LocalJobRunner Map Task Executor #0[]]: vector.VectorizedBatchUtil (VectorizedBatchUtil.java:debugDisplayOneRow(694)) - VectorSelectOperator exit... row 2 (col 0) bytes: 'key3' (col 1) NULL
./target/tmp/log/hive.log:2015-12-09T17:58:25,452 INFO  [LocalJobRunner Map Task Executor #0[]]: vector.VectorizedBatchUtil (VectorizedBatchUtil.java:debugDisplayOneRow(694)) - VectorSelectOperator exit... row 3 (col 0) bytes: 'key4' (col 1) 0
./target/tmp/log/hive.log:2015-12-09T17:58:25,452 INFO  [LocalJobRunner Map Task Executor #0[]]: vector.VectorizedBatchUtil (VectorizedBatchUtil.java:debugDisplayOneRow(694)) - VectorSelectOperator exit... row 4 (col 0) bytes: 'key5' (col 1) NULL
{code}

Given the input, this looks correct (true becomes 1, false becomes 0, and NULL stays NULL):

{code}
key1    true
key2    false
key3    NULL
key4    false
key5    NULL
{code}
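
As a reference point for the results the issue expects, the same query shape can be checked outside Hive. The sketch below is only an illustration and is not part of the Hive code base: it uses Python's built-in sqlite3 module as a stand-in engine, the function name is hypothetical, and it assumes SQLite's handling of COUNT over NULL matches standard SQL (COUNT(expr) skips NULL values). It says nothing about where the vectorized path goes wrong.

```python
import sqlite3

# Rows copied from the issue description; None stands in for SQL NULL.
ROWS = [
    ("key1", True),
    ("key2", False),
    ("key3", None),
    ("key4", False),
    ("key5", None),
]

# Same query shape as in the issue, with ORDER BY added for stable output.
QUERY = """
SELECT key,
       COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt
FROM count_case_groupby
GROUP BY key
ORDER BY key
"""


def count_case_groupby(rows):
    """Run the issue's query against an in-memory SQLite table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE count_case_groupby (key TEXT, bool BOOLEAN)")
        conn.executemany("INSERT INTO count_case_groupby VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for key, cnt in count_case_groupby(ROWS):
        print(key, cnt)
```

With the five rows above, this yields a count of 1 for key1, key2 and key4, and 0 for key3 and key5, which is the result the issue description expects.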

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12435
>                 URL: https://issues.apache.org/jira/browse/HIVE-12435
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.0.0
>            Reporter: Takahiko Saito
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: vector_select_null2.q
>
>
> Run the following statements:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values
>   ('key1', true),('key2', false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The following query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) AS cnt_bool0_ok
> FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while the expected results are:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query returns the correct results on Hive 1.2. It also works when the
> table is not stored as ORC, and on an ORC table when vectorization is
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
