[ 
https://issues.apache.org/jira/browse/HIVE-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-8433:
-----------------------------------
    Attachment: HIVE-8433.patch

This patch makes sure there are no duplicate column names in the row resolver 
and adds an early schema check, so that CBO is disabled in such cases. A 
previous check was meant to do this, but it erroneously passed for this query.
Fixing this properly would require CBO to handle duplicate column names...
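
For illustration only, an early check of this kind could look roughly like the 
sketch below. It is not the code in the attached patch: the class and method 
names (DuplicateColumnCheck, findDuplicateNames, canUseCbo) are hypothetical, 
and the case-insensitive comparison is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not the HIVE-8433 patch: detect duplicate output
// column names up front so the caller can fall back to the non-CBO path.
class DuplicateColumnCheck {

    // Returns each name that occurs more than once, reported a single time.
    // Names are compared case-insensitively (an assumption for this sketch).
    static List<String> findDuplicateNames(List<String> columnNames) {
        Set<String> seen = new HashSet<>();
        Set<String> reported = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String name : columnNames) {
            String key = name.toLowerCase(Locale.ROOT);
            // add() returns false when the key is already present.
            if (!seen.add(key) && reported.add(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }

    // The optimizer is only attempted when every output name is unique.
    static boolean canUseCbo(List<String> columnNames) {
        return findDuplicateNames(columnNames).isEmpty();
    }

    public static void main(String[] args) {
        // Both select expressions end up under the alias "value", as in the
        // AST quoted in the issue description.
        List<String> outputNames = Arrays.asList("value", "value");
        System.out.println("duplicates: " + findDuplicateNames(outputNames));
        System.out.println("use CBO: " + canUseCbo(outputNames));
    }
}
```

Rejecting the query up front keeps the result correct at the cost of skipping 
the optimizer; it does not make duplicate names work under CBO.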

> CBO loses a column during AST conversion
> ----------------------------------------
>
>                 Key: HIVE-8433
>                 URL: https://issues.apache.org/jira/browse/HIVE-8433
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>         Attachments: HIVE-8433.patch
>
>
> {noformat}
> SELECT
>   CAST(value AS BINARY),
>   value
> FROM src
> ORDER BY value
> LIMIT 100
> {noformat}
> returns only one column.
> Final CBO plan is
> {noformat}
>   HiveSortRel(sort0=[$1], dir0=[ASC]): rowcount = 500.0, cumulative cost = {24858.432393688767 rows, 500.0 cpu, 0.0 io}, id = 44
>     HiveProjectRel(value=[CAST($0):BINARY(2147483647) NOT NULL], value1=[$0]): rowcount = 500.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 42
>       HiveProjectRel(value=[$1]): rowcount = 500.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 40
>         HiveTableScanRel(table=[[default.src]]): rowcount = 500.0, cumulative cost = {0}, id = 0
> {noformat}
> but the resulting AST has only one column. This must be a bug in the 
> conversion, probably related to the name collision in the schema, judging by 
> the alias of the column for the binary-cast value in the AST:
> {noformat} 
> TOK_QUERY
>    TOK_FROM
>       TOK_SUBQUERY
>          TOK_QUERY
>             TOK_FROM
>                TOK_TABREF
>                   TOK_TABNAME
>                      default
>                      src
>                   src
>             TOK_INSERT
>                TOK_DESTINATION
>                   TOK_DIR
>                      TOK_TMP_FILE
>                TOK_SELECT
>                   TOK_SELEXPR
>                      .
>                         TOK_TABLE_OR_COL
>                            src
>                         value
>                      value
>          $hdt$_0
>    TOK_INSERT
>       TOK_DESTINATION
>          TOK_DIR
>             TOK_TMP_FILE
>       TOK_SELECT
>          TOK_SELEXPR
>             TOK_FUNCTION
>                TOK_BINARY
>                .
>                   TOK_TABLE_OR_COL
>                      $hdt$_0
>                   value
>             value
>       TOK_ORDERBY
>          TOK_TABSORTCOLNAMEASC
>             TOK_TABLE_OR_COL
>                value
>       TOK_LIMIT
>          100
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
