[ https://issues.apache.org/jira/browse/HIVE-24367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harshit Gupta reassigned HIVE-24367:
------------------------------------

    Assignee: Harshit Gupta

> Explore whether HiveAlterHandler::alterTable can be optimised for non-partitioned tables
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-24367
>                 URL: https://issues.apache.org/jira/browse/HIVE-24367
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>            Reporter: Rajesh Balamohan
>            Assignee: Harshit Gupta
>            Priority: Major
>
> Writing many deltas to a non-partitioned table causes runtime issues once a
> large number of delta folders are present.
>
> The following code in HiveAlterHandler is invoked for every insert operation.
> It calls {{updateTableStatsSlow}} on every insert, which causes runtime delays.
>
> {noformat}
> if (MetaStoreUtils.requireCalStats(null, null, newt, environmentContext) && !isPartitionedTable) {
>   Database db = msdb.getDatabase(catName, newDbName);
>   assert(isReplicated == HiveMetaStore.HMSHandler.isDbReplicationTarget(db));
>   // Update table stats. For partitioned table, we update stats in alterPartition()
>   MetaStoreUtils.updateTableStatsSlow(db, newt, wh, false, true, environmentContext);
> }
> {noformat}
>
> It would be good to explore whether only the newly added delta can be listed
> when computing stats. This would avoid a huge listing call during stats
> collection.
>
> E.g. queries to reproduce:
>
> {noformat}
> CREATE TABLE IF NOT EXISTS test (name String, value int);
> INSERT INTO test VALUES('K1',1);
> INSERT INTO test VALUES('K2',2);
> ..
> ..
> ..
> INSERT INTO test VALUES('K20000',2);
> {noformat}
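
One way to picture the suggestion above, as a rough sketch only: keep the previously stored file count and total size, and fold in only the files of the newly written delta directory instead of re-listing the whole table. The names below (`IncrementalStatsSketch`, `BasicStats`, `addDelta`) are hypothetical and not part of Hive. The sketch uses `java.nio.file` on a local path rather than Hive's own warehouse and file system classes, and it ignores compaction, aborted transactions, and deletes, all of which a real change would need to handle.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration; none of these types exist in Hive.
class IncrementalStatsSketch {

    /** Holder for two basic table stats: number of files and total bytes. */
    static final class BasicStats {
        final long numFiles;
        final long totalSize;

        BasicStats(long numFiles, long totalSize) {
            this.numFiles = numFiles;
            this.totalSize = totalSize;
        }
    }

    /**
     * Lists only the given delta directory (not the whole table location)
     * and returns the previous stats plus the files found there.
     */
    static BasicStats addDelta(BasicStats current, Path deltaDir) throws IOException {
        long files = 0;
        long bytes = 0;
        // Files.list is non-recursive, so the cost is proportional to one delta.
        try (Stream<Path> entries = Files.list(deltaDir)) {
            for (Path p : (Iterable<Path>) entries::iterator) {
                if (Files.isRegularFile(p)) {
                    files++;
                    bytes += Files.size(p);
                }
            }
        }
        return new BasicStats(current.numFiles + files, current.totalSize + bytes);
    }
}
```

With this shape, the cost of each insert depends on the size of the new delta rather than on the number of deltas already in the table, which is the property the issue asks to explore.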