[ 
https://issues.apache.org/jira/browse/HIVE-24240?focusedWorklogId=552960&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-552960
 ]

ASF GitHub Bot logged work on HIVE-24240:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Feb/21 13:31
            Start Date: 16/Feb/21 13:31
    Worklog Time Spent: 10m 
      Work Description: okumin opened a new pull request #1984:
URL: https://github.com/apache/hive/pull/1984


   https://issues.apache.org/jira/browse/HIVE-24240
   
   ### What changes were proposed in this pull request?
   Fix incorrect computations to estimate UDTF size.
   
   - Put 1 when numRows becomes zero, because Hive expects it to be non-zero
     in regular cases
   - Use `StatsUtils.scaleColStatistics` to update column stats so that the
     number of distinct values is also updated
   - Wrap the final stats with `applyRuntimeStats`
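
The three steps above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the class `UdtfStatsSketch`, its methods, and the `factor` parameter are hypothetical stand-ins, and the exact scaling Hive applies to distinct-value counts is an assumption here.

```java
import java.util.OptionalLong;

// Hypothetical illustration only; none of these names exist in Hive.
public final class UdtfStatsSketch {

    // Step 1: scale the parent's row count, but never report zero rows.
    static long estimateRows(long parentRows, double factor) {
        long scaled = (long) (parentRows * factor);
        return Math.max(1L, scaled);
    }

    // Step 2 (assumed behavior): scale a column's distinct-value count by the
    // same factor, rounding up, and cap it at the new row count.
    static long scaleDistinct(long distinctValues, double factor, long newRows) {
        long scaled = (long) Math.ceil(distinctValues * factor);
        return Math.min(scaled, newRows);
    }

    // Step 3: prefer a row count observed at runtime (for example, from a
    // previous execution) over the estimate when one is available.
    static long withRuntimeRows(long estimatedRows, OptionalLong observedRows) {
        return observedRows.orElse(estimatedRows);
    }

    public static void main(String[] args) {
        long rows = estimateRows(3, 0.1);        // truncates to 0, clamped to 1
        long ndv = scaleDistinct(3, 0.1, rows);  // rounds up to 1, capped at 1
        long finalRows = withRuntimeRows(rows, OptionalLong.empty());
        System.out.println(rows + " " + ndv + " " + finalRows);
    }
}
```

Clamping in step 1 keeps downstream estimates from multiplying or dividing by zero, which is the situation the first bullet describes.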
   
   This is a follow-up of https://github.com/apache/hive/pull/1531.
   
   ### Why are the changes needed?
   This PR would help Hive to compute more precise stats for UDTF.
   
   ### Does this PR introduce _any_ user-facing change?
   No. The change is compatible from the users' point of view.
   
   ### How was this patch tested?
   Revised one unit test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 552960)
    Remaining Estimate: 0h
            Time Spent: 10m

> Implement missing features in UDTFStatsRule
> -------------------------------------------
>
>                 Key: HIVE-24240
>                 URL: https://issues.apache.org/jira/browse/HIVE-24240
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: okumin
>            Assignee: okumin
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add the following steps.
>  * Handle the case in which the number of rows will be zero
>  * Compute runtime stats in case of a re-execution



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
