[ 
https://issues.apache.org/jira/browse/HIVE-28642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Zhang updated HIVE-28642:
----------------------------
    Affects Version/s: 4.0.1

> column stats state unnecessarily created when 
> hive.stats.column.autogather=false 
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-28642
>                 URL: https://issues.apache.org/jira/browse/HIVE-28642
>             Project: Hive
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone) 
>          Components: Hive
>    Affects Versions: 3.1.3, 4.0.1
>            Reporter: Yi Zhang
>            Priority: Major
>
> In loadPartition, when the partition is new, the following code runs:
> ```
> if (this.getConf().getBoolVar(HiveConf.ConfVars.HIVE_STATS_AUTOGATHER)) {
>   StatsSetupConst.setStatsStateForCreateTable(newTPart.getParameters(),
>       MetaStoreUtils.getColumnNames(tbl.getCols()), StatsSetupConst.TRUE);
> }
> ```
>  
> This results in the following statement being sent to the metastore database:
>  
> INSERT INTO `PARTITION_PARAMS` (`PARAM_VALUE`,`PART_ID`,`PARAM_KEY`) VALUES 
> (<'{"BASIC_STATS":"true","COLUMN_STATS":{"col1":"true","col2":"true", 
> ....}}'>,<593>,<'COLUMN_STATS_ACCURATE'>)
>  
> If the table has many columns, this can fail with a PARAM_VALUE too long error.
> Although HIVE-20221 increased the PARAM_VALUE width to blob in hive-4, the
> width in hive-3 is too small.
> Besides, setting the column stats state seems unnecessary when
> hive.stats.column.autogather=false.
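
To illustrate the size problem described above, here is a minimal, self-contained sketch. It is not Hive code: the class and method names are hypothetical, the JSON is built by hand to mimic the shape of the COLUMN_STATS_ACCURATE value shown in the report, and the 4000-character limit is an assumption about the hive-3 PARAM_VALUE column that should be checked against the actual metastore schema. The boolean flag stands in for hive.stats.column.autogather; the sketch only shows how skipping the column entries keeps the value small.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnStatsStateSketch {

    // Assumption: width of PARTITION_PARAMS.PARAM_VALUE in a hive-3 schema.
    // Verify against your own metastore schema before relying on it.
    public static final int ASSUMED_PARAM_VALUE_WIDTH = 4000;

    // Builds a string shaped like the COLUMN_STATS_ACCURATE value in the report.
    // When includeColumnStats is false, only the BASIC_STATS entry is written.
    public static String buildStatsState(List<String> columns, boolean includeColumnStats) {
        StringBuilder sb = new StringBuilder("{\"BASIC_STATS\":\"true\"");
        if (includeColumnStats && !columns.isEmpty()) {
            sb.append(",\"COLUMN_STATS\":{");
            for (int i = 0; i < columns.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append('"').append(columns.get(i)).append("\":\"true\"");
            }
            sb.append('}');
        }
        return sb.append('}').toString();
    }

    // Generates hypothetical column names col1..colN.
    public static List<String> columnNames(int n) {
        List<String> names = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            names.add("col" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> cols = columnNames(500);
        // Stand-in for hive.stats.column.autogather=false.
        boolean columnAutogather = false;

        String unguarded = buildStatsState(cols, true);
        String guarded = buildStatsState(cols, columnAutogather);

        System.out.println("unguarded length: " + unguarded.length()
            + ", exceeds assumed width: "
            + (unguarded.length() > ASSUMED_PARAM_VALUE_WIDTH));
        System.out.println("guarded length: " + guarded.length()
            + ", exceeds assumed width: "
            + (guarded.length() > ASSUMED_PARAM_VALUE_WIDTH));
    }
}
```

The value grows linearly with the number of columns, so a wide table can exceed a fixed-width column even though no column statistics were actually gathered. Whether the real fix should key off hive.stats.column.autogather exactly this way is for the patch to decide; the sketch only shows the effect on the stored value.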



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
