[ https://issues.apache.org/jira/browse/HIVE-25779?focusedWorklogId=789383&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-789383 ]
ASF GitHub Bot logged work on HIVE-25779:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Jul/22 07:14
            Start Date: 11/Jul/22 07:14
    Worklog Time Spent: 10m
      Work Description: dengzhhu653 commented on code in PR #3221:
URL: https://github.com/apache/hive/pull/3221#discussion_r917610133

##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java:
##########
@@ -1276,6 +1276,9 @@ public enum ConfVars {
         "hive.metastore.serdes.without.from.deserializer",
         "org.apache.iceberg.mr.hive.HiveIcebergSerDe",
         "SerDes which are providing the schema but do not need the 'from deserializer' comment for the columns."),
+    USE_TABLE_SERDES("metastore.use.table.serdes",

Review Comment:
   Not sure whether we can turn this feature on by default, since it is a good performance improvement.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 789383)
    Time Spent: 2h 40m  (was: 2.5h)

> Deduplicate SerDe Info
> ----------------------
>
>                 Key: HIVE-25779
>                 URL: https://issues.apache.org/jira/browse/HIVE-25779
>             Project: Hive
>          Issue Type: New Feature
>          Components: Standalone Metastore
>            Reporter: Yu-Wen Lai
>            Assignee: Yu-Wen Lai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The proposal is to reuse serde info the same way we reuse column
> descriptors (HIVE-2246).
> Currently, we store the metadata for partitions as PARTITIONS (N partitions)
> -> SDS (N locations) -> SERDES (N entries). However, all the SERDES entries
> for the partitions in a table are the same unless we explicitly specify
> otherwise. That is, each storage descriptor has an associated and exclusive
> serde info, but the partitions' serde infos are mostly the same as the
> table's. By reusing the serde info, we can save some database storage and
> improve the performance of queries from HMS to the backend database.
> For backward compatibility, we also need to introduce a config for this
> feature, because there will be issues if an old HMS instance and a new HMS
> instance with this feature are running together. With this feature, we need
> to check whether anything else references a serde before deleting it, but
> the old instance will just delete it.
> The other thing we need to take care of is custom serdes. If a partition's
> serde is modified, we need to create a new record in SERDES so that we don't
> interfere with other partitions.
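
The two rules in the description (check for other references before deleting a serde, and create a new SERDES record when a partition's serde is modified) can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the code in PR #3221: the class `SerdeDedupSketch` and all of its methods are hypothetical, and the two maps merely stand in for the SERDES table and the serde reference held by each SDS row.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of shared serde records; not Hive code. */
public class SerdeDedupSketch {
    // serdeId -> serialization library name (stand-in for a SERDES row)
    private final Map<Long, String> serdes = new HashMap<>();
    // sdId -> serdeId (stand-in for the serde reference stored in an SDS row)
    private final Map<Long, Long> sdToSerde = new HashMap<>();
    private long nextSerdeId = 1;

    /** Create a new serde record and return its id. */
    public long addSerde(String lib) {
        long id = nextSerdeId++;
        serdes.put(id, lib);
        return id;
    }

    /** Point a storage descriptor at an existing serde instead of copying it. */
    public void attach(long sdId, long serdeId) {
        if (!serdes.containsKey(serdeId)) {
            throw new IllegalArgumentException("unknown serde " + serdeId);
        }
        sdToSerde.put(sdId, serdeId);
    }

    /** Drop a storage descriptor; delete its serde only if nothing else references it. */
    public void dropSd(long sdId) {
        Long serdeId = sdToSerde.remove(sdId);
        if (serdeId != null && !sdToSerde.containsValue(serdeId)) {
            serdes.remove(serdeId);
        }
    }

    /** Change one descriptor's serde; create a new record if the current one is shared. */
    public long alterSerde(long sdId, String newLib) {
        Long current = sdToSerde.get(sdId);
        if (current == null) {
            throw new IllegalArgumentException("unknown storage descriptor " + sdId);
        }
        long refs = sdToSerde.values().stream().filter(current::equals).count();
        if (refs > 1) {
            // Shared: write a fresh record so other partitions are not affected.
            long fresh = addSerde(newLib);
            sdToSerde.put(sdId, fresh);
            return fresh;
        }
        // Exclusive: safe to update in place.
        serdes.put(current, newLib);
        return current;
    }

    /** Library name used by a storage descriptor, or null if it is unknown. */
    public String serdeOf(long sdId) {
        Long id = sdToSerde.get(sdId);
        return id == null ? null : serdes.get(id);
    }

    /** Number of serde records currently stored. */
    public int serdeCount() {
        return serdes.size();
    }
}
```

An old instance that deletes the serde unconditionally corresponds to `dropSd` without the `containsValue` check, which would leave the remaining descriptors pointing at a missing record. That is the mixed-version hazard the proposed config is meant to guard against.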