Prasanth Jayachandran created HIVE-11477:
--------------------------------------------

             Summary: CBO inserts a UDF cast for integer type promotion
                 Key: HIVE-11477
                 URL: https://issues.apache.org/jira/browse/HIVE-11477
             Project: Hive
          Issue Type: Bug
    Affects Versions: 2.0.0
            Reporter: Prasanth Jayachandran
            Assignee: Pengcheng Xiong


When CBO is enabled, filters that compare tinyint or smallint columns with 
integer constants get a UDFToInteger cast inserted around the column. When 
CBO is disabled, no such UDF is inserted. This behaviour breaks ORC predicate 
pushdown, because ORC ignores UDFs in filters.
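
To illustrate why the extra cast matters, here is a minimal toy sketch of a pushdown converter that only accepts predicates of the form "bare column, operator, literal". All class and function names below are hypothetical and exist only for illustration; this is not Hive's or ORC's actual code, just a model of the behaviour described above, assuming the reader skips any predicate whose left side is wrapped in a UDF.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Column:
    """A bare column reference, e.g. t."""
    name: str


@dataclass(frozen=True)
class UdfCall:
    """A UDF wrapped around an expression, e.g. UDFToInteger(t)."""
    udf: str
    arg: "Expr"


@dataclass(frozen=True)
class Compare:
    """A comparison of an expression against an integer literal."""
    op: str
    left: "Expr"
    right: int


Expr = Union[Column, UdfCall]


def to_pushdown_leaf(pred: Compare) -> Optional[Tuple[str, str, int]]:
    """Return (column, op, literal) if the predicate can be pushed down.

    Hypothetical rule: only a bare column on the left side is accepted.
    Anything wrapped in a UDF is ignored, so no row groups can be skipped.
    """
    if isinstance(pred.left, Column):
        return (pred.left.name, pred.op, pred.right)
    return None


# Predicate shape without the cast: (t < -127)
without_cast = Compare("<", Column("t"), -127)
# Predicate shape with the cast: (UDFToInteger(t) < -127)
with_cast = Compare("<", UdfCall("UDFToInteger", Column("t")), -127)

print(to_pushdown_leaf(without_cast))
print(to_pushdown_leaf(with_cast))
```

In this model the first predicate yields a pushable leaf, while the second yields nothing, so the filter is evaluated only after the rows have been read.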

In the following examples, column t is a tinyint.
{code:title=Explain for select count(*) from orc_ppd where t < -127; (CBO OFF)}
Filter Operator [FIL_9]
                           predicate:(t = 125) (type: boolean)
                           Statistics:Num rows: 1050 Data size: 611757 Basic stats: COMPLETE Column stats: NONE
                           TableScan [TS_0]
                              alias:orc_ppd
                              Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
{code}

{code:title=Explain for select count(*) from orc_ppd where t < -127; (CBO ON)}
Filter Operator [FIL_10]
                           predicate:(UDFToInteger(t) < -127) (type: boolean)
                           Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE
                           TableScan [TS_0]
                              alias:orc_ppd
                              Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
{code}

CBO does not insert such a cast for non-negative constants:
{code:title=Explain for select count(*) from orc_ppd where t < 127; (CBO ON)}
Filter Operator [FIL_10]
                           predicate:(t < 127) (type: boolean)
                           Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE
                           TableScan [TS_0]
                              alias:orc_ppd
                              Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
