Seonggon Namgung created HIVE-28196:
---------------------------------------

             Summary: Preserve column stats when applying UDF upper/lower.
                 Key: HIVE-28196
                 URL: https://issues.apache.org/jira/browse/HIVE-28196
             Project: Hive
          Issue Type: Improvement
            Reporter: Seonggon Namgung
            Assignee: Seonggon Namgung


Currently, Hive re-estimates column stats (including avgColLen) when it 
encounters a UDF.
In the case of upper and lower, Hive sets avgColLen to the value of 
hive.stats.max.variable.length.
However, these UDFs do not change column stats, and the default value (100) is 
too high for the string-type key columns on which upper/lower are usually applied.

This patch keeps the input data's avgColLen after applying UDF upper/lower, so 
that Hive can produce a better query plan.
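
To illustrate the effect, here is a minimal Python sketch of the two behaviours 
described above. It is a toy model, not Hive's actual code: the names ColStats, 
estimate_avg_col_len and estimated_data_size are hypothetical, and the data-size 
formula (avgColLen * row count) is a simplifying assumption. The default of 100 
and the premise that upper/lower leave column stats unchanged are taken from 
this issue.

```python
from dataclasses import dataclass

# Default of hive.stats.max.variable.length, as stated in this issue.
MAX_VARIABLE_LENGTH = 100

# UDFs that the issue says do not change column stats.
CASE_ONLY_UDFS = {"upper", "lower"}


@dataclass
class ColStats:
    """Hypothetical, simplified column statistics."""
    avg_col_len: float
    num_rows: int


def estimate_avg_col_len(udf_name: str, input_stats: ColStats,
                         preserve_case_udfs: bool = True) -> float:
    """Return the estimated avgColLen of udf_name(column).

    preserve_case_udfs=False models the behaviour described as current:
    every UDF falls back to the maximum variable length.
    preserve_case_udfs=True models the proposed behaviour: upper/lower
    keep the avgColLen of their input.
    """
    if preserve_case_udfs and udf_name.lower() in CASE_ONLY_UDFS:
        return input_stats.avg_col_len
    return float(MAX_VARIABLE_LENGTH)


def estimated_data_size(udf_name: str, input_stats: ColStats,
                        preserve_case_udfs: bool = True) -> float:
    """Assumed size model: avgColLen multiplied by the row count."""
    avg_len = estimate_avg_col_len(udf_name, input_stats, preserve_case_udfs)
    return avg_len * input_stats.num_rows


# Example: a short string key column with one million rows.
key_stats = ColStats(avg_col_len=8.0, num_rows=1_000_000)
before = estimated_data_size("upper", key_stats, preserve_case_udfs=False)
after = estimated_data_size("upper", key_stats, preserve_case_udfs=True)
print(before, after)
```

Under these assumptions, the estimate for a key column with an average length 
of 8 shrinks by a factor of 12.5, which is the kind of overestimate the patch 
is meant to avoid.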



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
