[ 
https://issues.apache.org/jira/browse/HIVE-24666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271106#comment-17271106
 ] 

Zhihua Deng commented on HIVE-24666:
------------------------------------

Thanks much for the reply.

> So this might need a generic wrap PROJECTION with SelectColumnIsTrue as a 
>general case (not just for cast or just that one cast boolean).

I'm not sure I fully understand this. Do you mean adding the cast for all 
non-boolean filter expressions in the logical plan and moving the vectorized 
UDFToBoolean handling into a standalone method? Vectorization already wraps 
constants, user-defined functions, and columns with SelectColumnIsTrue when 
they are used to filter rows. Could you please elaborate a little if I have 
misunderstood?

> but it fixes only the specific issue by wrapping it with the filter (that 
>modifies the .selected vector) - the real issue is hiding somewhere else.

I think the cause is that the vectorized expressions of UDFToBoolean only have 
PROJECTION mode in them, as you have explained, so when we use it to filter 
rows, we should evaluate SelectColumnIsTrue on the results of cast before 
forwarding the batch to the next operation.
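
To illustrate the difference between the two modes, here is a minimal, 
self-contained sketch. It does not use Hive's classes: Batch, 
projectCastToBoolean and selectColumnIsTrue are hypothetical stand-ins, and 
the list of false-like strings is only taken from the values in the 
reproduction, not from Hive's full cast rules. The point is that a projection 
only writes an output column, while a filter has to shrink the selected 
vector.

```java
import java.util.Arrays;

class ProjectionThenFilterSketch {
    /** Hypothetical stand-in for a vectorized row batch (not Hive's class). */
    static final class Batch {
        final String[] keys;      // input string column
        final long[] castResult;  // output column of the cast: 1 = true, 0 = false
        final int[] selected;     // indices of the rows that are still live
        int size;                 // number of live rows

        Batch(String[] keys) {
            this.keys = keys;
            this.castResult = new long[keys.length];
            this.selected = new int[keys.length];
            for (int i = 0; i < keys.length; i++) {
                selected[i] = i;
            }
            this.size = keys.length;
        }
    }

    // Assumption: only the false-like strings from the reproduction are handled.
    static boolean isFalseLike(String s) {
        switch (s.toLowerCase()) {
            case "0":
            case "false":
            case "off":
            case "no":
                return true;
            default:
                return false;
        }
    }

    /** PROJECTION: fills the output column, leaves selected/size untouched. */
    static void projectCastToBoolean(Batch b) {
        for (int j = 0; j < b.size; j++) {
            int i = b.selected[j];
            b.castResult[i] = isFalseLike(b.keys[i]) ? 0L : 1L;
        }
    }

    /** FILTER: keeps only the rows whose boolean column is true. */
    static void selectColumnIsTrue(Batch b) {
        int newSize = 0;
        for (int j = 0; j < b.size; j++) {
            int i = b.selected[j];
            if (b.castResult[i] == 1L) {
                b.selected[newSize++] = i;
            }
        }
        b.size = newSize;
    }

    /** Returns the key values of the rows that are still live. */
    static String[] liveKeys(Batch b) {
        String[] out = new String[b.size];
        for (int j = 0; j < b.size; j++) {
            out[j] = b.keys[b.selected[j]];
        }
        return out;
    }

    public static void main(String[] args) {
        Batch b = new Batch(new String[] {"0", "false", "off", "no", "vk"});
        projectCastToBoolean(b);
        System.out.println("after projection: " + b.size + " rows");
        selectColumnIsTrue(b);
        System.out.println("after filter: " + Arrays.toString(liveKeys(b)));
    }
}
```

In this sketch all five rows are still live after the projection alone, and 
only the 'vk' row remains once the filter step has run on the cast result.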

 

> Vectorized UDFToBoolean may unable to filter rows if input is string
> --------------------------------------------------------------------
>
>                 Key: HIVE-24666
>                 URL: https://issues.apache.org/jira/browse/HIVE-24666
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-24666.2.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If we use a cast to boolean in a WHERE condition to filter rows, vectorized 
> execution fails to filter them. Steps to reproduce:
> {code:java}
> create table vtb (key string, value string);
> insert into table vtb values('0', 'val0'), ('false', 'valfalse'),('off', 
> 'valoff'),('no','valno'),('vk', 'valvk');
> select distinct value from vtb where cast(key as boolean); {code}
> It seems we don't generate a SelectColumnIsTrue to filter the rows if the 
> input to the cast is a string:
>  
> https://github.com/apache/hive/blob/ff6f3565e50148b7bcfbcf19b970379f2bd59290/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L2995-L2996



--
This message was sent by Atlassian Jira
(v8.3.4#803005)