[ 
https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392336#comment-14392336
 ] 

Gopal V commented on HIVE-9645:
-------------------------------

[~ashutoshc]: The latest patch is still breaking type-checks all over the place 
when I test it.

We can start using VOID types across the board, but it would effectively drop 
all compatibility for hive-1.0 UDFs & UDAFs.

{code}
explain select * from
(select trim(concat("", ss_sold_date_sk)) from store_sales a
UNION ALL
select trim(concat("", ss_sold_date_sk)) from store_sales b where 
ss_sold_date_sk is null)
x
limit 2
;
{code}

Every user-deployed UDF out there which is strict-typed would need a rewrite to 
support this patch & the type-shifting has the ability to break vectorization 
on the reducer side, because the key types won't match between ReduceSinks on a 
JOIN.

To keep UDF compatibility, you have to resolve the {{ss_sold_date_sk}} to a 
{{cast(null as bigint)}} and refuse to constant-fold casts to VOID, instead 
generating a {{new JavaConstantLongObjectInspector(null);}} in its place.

That seems to be a good way to keep the schema from changing. I will defer 
to [~jpullokkaran] on how the CBO layer will treat this case, but UDF compat 
might be worth considering now that we're already at 1.0.

> Constant folding case NULL equality
> -----------------------------------
>
>                 Key: HIVE-9645
>                 URL: https://issues.apache.org/jira/browse/HIVE-9645
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 0.14.0, 1.0.0, 1.1.0
>            Reporter: Gopal V
>            Assignee: Ashutosh Chauhan
>         Attachments: HIVE-9645.1.patch, HIVE-9645.2.patch, HIVE-9645.3.patch, 
> HIVE-9645.patch
>
>
> The Hive logical optimizer does not follow the Null scan codepath when 
> encountering NULL = 1;
> NULL = 1 is not evaluated as false in the constant propagation implementation.
> {code}
> hive> explain select count(1) from store_sales where null=1;
> ...
>              TableScan
>                   alias: store_sales
>                   filterExpr: (null = 1) (type: boolean)
>                   Statistics: Num rows: 550076554 Data size: 49570324480 
> Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (null = 1) (type: boolean)
>                     Statistics: Num rows: 275038277 Data size: 0 Basic stats: 
> PARTIAL Column stats: COMPLETE
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
