[ https://issues.apache.org/jira/browse/HIVE-25779?focusedWorklogId=787487&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-787487 ]
ASF GitHub Bot logged work on HIVE-25779:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Jul/22 05:06
            Start Date: 04/Jul/22 05:06
    Worklog Time Spent: 10m
      Work Description: dengzhhu653 commented on code in PR #3221:
URL: https://github.com/apache/hive/pull/3221#discussion_r912629023


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java:
##########
@@ -4220,6 +4236,24 @@ private Set<MColumnDescriptor> detachCdsFromSdsNoTxn(
     }
   }

+  private Set<MSerDeInfo> detachSerDeInfosFromSdsNoTxn(String catName, String dbName, String tblName,

Review Comment:
   Could this method be unified with detachCdsFromSdsNoTxn? The same question applies to removeUnusedColumnDescriptor and removeUnusedSerDeInfo.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 787487)
    Time Spent: 1.5h  (was: 1h 20m)

> Deduplicate SerDe Info
> ----------------------
>
>                 Key: HIVE-25779
>                 URL: https://issues.apache.org/jira/browse/HIVE-25779
>             Project: Hive
>          Issue Type: New Feature
>          Components: Standalone Metastore
>            Reporter: Yu-Wen Lai
>            Assignee: Yu-Wen Lai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The proposal is that we can reuse serde info the same way we reuse column
> descriptors (HIVE-2246).
> Currently, we store the metadata for partitions as PARTITIONS (N partitions)
> -> SDS (N locations) -> SERDES (N entries). However, all the SERDES entries
> for the partitions in a table are the same unless we explicitly specify
> otherwise. That is, each storage descriptor has an associated and exclusive
> serde info, but the partitions' serde infos are mostly just the same as the
> table's. By reusing the serde info, we can save some database storage and
> improve the performance of queries from HMS to the backend database.
> For backward compatibility, we also need to introduce a config for this
> feature, because there will be issues if an old HMS instance and a new HMS
> instance with this feature are running together. With this feature, we need
> to check whether anything else references a serde before deleting it, but an
> old instance will just delete it.
> The other thing we need to take care of is custom serdes. If a partition's
> serde is modified, we need to create a new record in SERDES so that we don't
> interfere with other partitions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
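
For readers following the issue description, the sharing rules it proposes (reuse an identical serde, check for other references before deleting, and create a new record when one partition's serde is modified) can be sketched with a small in-memory model. This is only an illustration under stated assumptions: SharedSerDeSketch, SerDe, setSerDe, dropStorageDescriptor and serdeRowCount are hypothetical names, not ObjectStore methods, and the maps merely stand in for the SERDES and SDS tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** In-memory sketch of shared SERDES rows. All names are hypothetical, not Hive's API. */
public class SharedSerDeSketch {

    /** Stand-in for one SERDES row, compared by content. */
    public static final class SerDe {
        final String lib;
        final Map<String, String> params;

        public SerDe(String lib, Map<String, String> params) {
            this.lib = lib;
            this.params = new HashMap<>(params); // defensive copy
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SerDe)) {
                return false;
            }
            SerDe other = (SerDe) o;
            return lib.equals(other.lib) && params.equals(other.params);
        }

        @Override
        public int hashCode() {
            return Objects.hash(lib, params);
        }
    }

    // Stand-in for SERDES: one entry per distinct serde, mapped to its row id.
    private final Map<SerDe, Long> serdeIds = new HashMap<>();
    // Stand-in for SDS: storage descriptor id -> the serde it references.
    private final Map<Long, SerDe> sdToSerde = new HashMap<>();
    private long nextSerdeId = 1;

    /**
     * Points a storage descriptor at a serde. An identical existing row is
     * reused; a modified serde gets a new row, so other storage descriptors
     * that still reference the old row are not affected.
     */
    public long setSerDe(long sdId, SerDe serde) {
        Long id = serdeIds.get(serde);
        if (id == null) {
            id = nextSerdeId++;
            serdeIds.put(serde, id);
        }
        SerDe previous = sdToSerde.put(sdId, serde);
        removeIfUnreferenced(previous);
        return id;
    }

    /** Drops a storage descriptor and deletes its serde row only if nothing else uses it. */
    public void dropStorageDescriptor(long sdId) {
        removeIfUnreferenced(sdToSerde.remove(sdId));
    }

    /** The reference check that an instance without this feature would skip. */
    private void removeIfUnreferenced(SerDe serde) {
        if (serde != null && !sdToSerde.containsValue(serde)) {
            serdeIds.remove(serde);
        }
    }

    public int serdeRowCount() {
        return serdeIds.size();
    }
}
```

In this model, two storage descriptors given equal SerDe values share one row, and changing the serde of one of them leaves the other pointing at the original row. The real change would do the equivalent inside ObjectStore transactions against the backing database, which this sketch does not attempt to show.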