[ 
https://issues.apache.org/jira/browse/HIVE-29161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Seonggon Namgung updated HIVE-29161:
------------------------------------
    Summary: Correct the row count computation affected by Dynamic SemiJoin 
Reduction  (was: Update statistics correctly when applying the impact of 
Dynamic SemiJoin Reduction)

> Correct the row count computation affected by Dynamic SemiJoin Reduction
> ------------------------------------------------------------------------
>
>                 Key: HIVE-29161
>                 URL: https://issues.apache.org/jira/browse/HIVE-29161
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Minor
>              Labels: pull-request-available
>
> During SemiJoin branch removal based on benefit, Hive temporarily updates the 
> statistics of the filter operator so that a later-visited SemiJoin branch is 
> aware of the effect of the surviving SemiJoin branch. The adjusted number of 
> rows is computed using the following code:
> {code:java}
> long newNumRows = (long) (1.0 - roi.reductionFactor) *
>     roi.filterStats.getNumRows();
> {code}
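> Due to the missing parentheses around the product, the cast applies only to 
> (1.0 - roi.reductionFactor), which truncates to 0 for any reduction factor 
> in (0, 1] and to 1 for a factor of 0. Hive therefore sets newNumRows either 
> to 1 (adjusted from 0) or to the original value. This leads to incorrect 
> decisions in subsequent SemiJoin benefit computations and may result in 
> suboptimal query plans.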
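
The precedence problem described above can be reproduced outside Hive. The
following is a minimal sketch, not the actual Hive code or the merged patch:
the class name RowCountSketch and both method names are hypothetical, and the
corrected expression is only the presumed fix (parenthesizing the whole
product). The later adjustment from 0 to 1 mentioned in the description is
not modeled here.

```java
public class RowCountSketch {
    // Mirrors the expression quoted in the issue: the cast applies only to
    // (1.0 - reductionFactor), so the fraction is truncated before the
    // multiplication takes place.
    static long buggyNumRows(double reductionFactor, long numRows) {
        return (long) (1.0 - reductionFactor) * numRows;
    }

    // Presumed fix (an assumption, not taken from the patch): parenthesize
    // the whole product so that the cast to long happens last.
    static long correctedNumRows(double reductionFactor, long numRows) {
        return (long) ((1.0 - reductionFactor) * numRows);
    }

    public static void main(String[] args) {
        long numRows = 1000L;
        for (double factor : new double[] {0.0, 0.25, 0.75, 1.0}) {
            System.out.println("factor=" + factor
                + " buggy=" + buggyNumRows(factor, numRows)
                + " corrected=" + correctedNumRows(factor, numRows));
        }
    }
}
```

With 1000 input rows and a reduction factor of 0.75, the buggy form yields 0
while the parenthesized form yields 250. Only at a factor of 0.0 or 1.0 do
the two agree.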



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
