[ https://issues.apache.org/jira/browse/HIVE-25779?focusedWorklogId=787239&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-787239 ]

ASF GitHub Bot logged work on HIVE-25779:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jul/22 22:39
            Start Date: 01/Jul/22 22:39
    Worklog Time Spent: 10m
      Work Description: hsnusonic commented on PR #3221:
URL: https://github.com/apache/hive/pull/3221#issuecomment-1172766083

   @saihemanth-cloudera @dengzhhu653 Could you please review the patch?


Issue Time Tracking
-------------------

    Worklog Id:     (was: 787239)
    Time Spent: 1h  (was: 50m)

> Deduplicate SerDe Info
> ----------------------
>
>                 Key: HIVE-25779
>                 URL: https://issues.apache.org/jira/browse/HIVE-25779
>             Project: Hive
>          Issue Type: New Feature
>          Components: Standalone Metastore
>            Reporter: Yu-Wen Lai
>            Assignee: Yu-Wen Lai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The proposal is to reuse serde info the same way we reuse column
> descriptors. (HIVE-2246)
> Currently, we store the metadata for partitions as PARTITIONS (N partitions)
> -> SDS (N locations) -> SERDES (N entries). However, all the SERDES entries
> for the partitions in a table are the same unless a serde is explicitly
> specified. That is, each storage descriptor has its own exclusive serde
> info, but the partitions' serde infos are mostly identical to the table's.
> By reusing the serde info, we can save some database storage and improve the
> performance of queries from HMS to the backend database.
> For backward compatibility, we also need to introduce a config for this
> feature, because there will be issues if an old HMS instance and a new HMS
> instance with this feature are running together. With this feature, we need
> to check whether other storage descriptors reference a serde before deleting
> it, but an old instance will just delete it.
> The other thing we need to take care of is custom serdes. If a partition's
> serde is modified, we need to create a new record in SERDES so that we don't
> interfere with other partitions.

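The sharing, reference-check-before-delete, and new-record-on-modify behaviour described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code in the pull request and not the metastore schema: the class `SerdeStore` and its methods are hypothetical names, the two dictionaries merely stand in for the SDS and SERDES tables, and references are counted by a plain scan.

```python
class SerdeStore:
    """Toy stand-in for the SDS and SERDES tables (hypothetical, not Hive code)."""

    def __init__(self):
        self.serdes = {}   # serde_id -> (lib, sorted params); plays the role of SERDES
        self.by_key = {}   # (lib, sorted params) -> serde_id; used to find a reusable row
        self.sds = {}      # sd_id -> serde_id; plays the role of SDS
        self._next_id = 1

    def _intern(self, lib, params):
        # Reuse an existing identical serde row, or create a new one.
        key = (lib, tuple(sorted(params.items())))
        serde_id = self.by_key.get(key)
        if serde_id is None:
            serde_id = self._next_id
            self._next_id += 1
            self.serdes[serde_id] = key
            self.by_key[key] = serde_id
        return serde_id

    def _release(self, serde_id):
        # Delete the serde row only if no storage descriptor still references it.
        if not any(ref == serde_id for ref in self.sds.values()):
            key = self.serdes.pop(serde_id)
            del self.by_key[key]

    def add_sd(self, sd_id, lib, params):
        self.sds[sd_id] = self._intern(lib, params)

    def alter_serde(self, sd_id, lib, params):
        # A modified serde gets its own row, so other partitions are unaffected.
        old_id = self.sds[sd_id]
        new_id = self._intern(lib, params)
        self.sds[sd_id] = new_id
        if new_id != old_id:
            self._release(old_id)

    def drop_sd(self, sd_id):
        self._release(self.sds.pop(sd_id))


store = SerdeStore()
default_lib = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
for sd_id in ("table", "p1", "p2", "p3"):
    store.add_sd(sd_id, default_lib, {"serialization.format": "1"})
shared_rows = len(store.serdes)       # the table and its partitions share one row

store.alter_serde("p2", "org.apache.hadoop.hive.serde2.OpenCSVSerde", {})
rows_after_alter = len(store.serdes)  # p2 now points at its own row
```

In this model an instance without the reference check would correspond to calling `serdes.pop` unconditionally in `drop_sd`, which would remove a row that other storage descriptors still point to. That is the mixed-version hazard the description gives as the reason for a config.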
-- This message was sent by Atlassian Jira (v8.20.10#820010)