[ https://issues.apache.org/jira/browse/HIVE-24666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271106#comment-17271106 ]
Zhihua Deng commented on HIVE-24666:
------------------------------------

Thanks much for the reply.

> So this might need a generic wrap PROJECTION with SelectColumnIsTrue as a
> general case (not just for cast or just that one cast boolean).

I'm not sure I understand this, but the idea is to put the casts of all non-boolean filter expressions into the logical plan and to move the vectorized UDFToBoolean into a standalone method. Vectorization already wraps constants, user-defined functions, and columns with SelectColumnIsTrue when they are used to filter rows. Could you please explain a little more if I am wrong?

> but it fixes only the specific issue by wrapping it with the filter (that
> modifies the .selected vector) - the real issue is hiding somewhere else.

I think the cause is that the vectorized expressions for UDFToBoolean only have a PROJECTION mode, as you have explained. So when we use the cast to filter rows, we should evaluate SelectColumnIsTrue on the results of the cast before forwarding the batch to the next operator.
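To illustrate the projection-versus-filter point, here is a toy Python model of a row batch. It is only a sketch and not Hive code: the `Batch` class, the function names, and the set of false-like strings (taken from the repro rows in the issue) are all assumptions made for illustration. It shows that a projection-only cast writes a boolean column but leaves the `selected` vector alone, so every row is still forwarded until a filter step narrows `selected`.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: strings treated as false, taken from the repro rows in the
# issue. This is not meant to be Hive's exact casting rule.
FALSE_STRINGS = {"0", "false", "off", "no"}


@dataclass
class Batch:
    """Toy stand-in for a vectorized row batch (hypothetical, not Hive's class)."""
    keys: List[str]
    values: List[str]
    cast_result: Optional[List[int]] = None  # column written by the projection
    selected: Optional[List[int]] = None     # None means every row is selected

    def selected_rows(self) -> List[int]:
        if self.selected is None:
            return list(range(len(self.keys)))
        return self.selected


def cast_string_to_boolean(batch: Batch) -> None:
    """Projection step: writes 0/1 into a column, does not touch `selected`."""
    batch.cast_result = [0] * len(batch.keys)
    for i in batch.selected_rows():
        batch.cast_result[i] = 0 if batch.keys[i].lower() in FALSE_STRINGS else 1


def select_column_is_true(batch: Batch) -> None:
    """Filter step: narrows `selected` to rows whose boolean column is 1."""
    batch.selected = [i for i in batch.selected_rows() if batch.cast_result[i] == 1]


def forwarded_values(batch: Batch) -> List[str]:
    """Rows the next operator would see."""
    return [batch.values[i] for i in batch.selected_rows()]


def repro_batch() -> Batch:
    """The five rows from the repro in the issue description."""
    return Batch(
        keys=["0", "false", "off", "no", "vk"],
        values=["val0", "valfalse", "valoff", "valno", "valvk"],
    )


if __name__ == "__main__":
    batch = repro_batch()
    cast_string_to_boolean(batch)
    print(forwarded_values(batch))  # projection only: all five rows pass through
    select_column_is_true(batch)
    print(forwarded_values(batch))  # after the filter step: ['valvk']
```

In this model the bug corresponds to stopping after `cast_string_to_boolean`, and the proposed fix corresponds to always running `select_column_is_true` on the cast output when the cast is used as a filter condition.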

> Vectorized UDFToBoolean may unable to filter rows if input is string
> --------------------------------------------------------------------
>
>                 Key: HIVE-24666
>                 URL: https://issues.apache.org/jira/browse/HIVE-24666
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-24666.2.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If we use a cast to boolean in a where condition to filter rows, under
> vectorized execution the filter fails to filter rows. Steps to reproduce:
> {code:java}
> create table vtb (key string, value string);
> insert into table vtb values('0', 'val0'), ('false', 'valfalse'),('off', 'valoff'),('no','valno'),('vk', 'valvk');
> select distinct value from vtb where cast(key as boolean); {code}
> It seems we don't generate a SelectColumnIsTrue to filter the rows if the
> casted type is string:
>
> https://github.com/apache/hive/blob/ff6f3565e50148b7bcfbcf19b970379f2bd59290/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L2995-L2996

--
This message was sent by Atlassian Jira
(v8.3.4#803005)