[ https://issues.apache.org/jira/browse/HIVE-29165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016926#comment-18016926 ]

Zhihua Deng commented on HIVE-29165:
------------------------------------

Thank you [~dkuzmenko] for the review!

> PartColNameInfo could introduce high hash collision due to the wide table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-29165
>                 URL: https://issues.apache.org/jira/browse/HIVE-29165
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> In {{DirectSqlUpdatePart}}, the {{PartColNameInfo}} acts as a map key,
> referring to those statistics to be updated or inserted. If the current table
> has lots of columns, say 1000, then for each {{PartColNameInfo}}, it’s
> assumed to hash into the same map buckets more than 1000 times. If the table
> has thousands of partitions, locating or inserting the {{PartColNameInfo}}
> could be very slow.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
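
For readers who do not have the Hive source at hand, the collision pattern described in the issue can be illustrated with a small standalone sketch. Everything here is hypothetical: the class names, the fields `partitionId` and `colName`, and both `hashCode` variants are assumptions made for illustration, not the actual {{PartColNameInfo}} code or the change merged for HIVE-29165. The sketch only counts how many distinct hash codes a key produces when the column name is left out of the hash versus when it is included.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class PartColKeySketch {

    // Hypothetical key: equality uses both fields, but the hash uses only the partition id,
    // so every column of one partition shares a single hash code.
    static final class PartitionOnlyHashKey {
        final long partitionId;
        final String colName;

        PartitionOnlyHashKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof PartitionOnlyHashKey)) return false;
            PartitionOnlyHashKey other = (PartitionOnlyHashKey) o;
            return partitionId == other.partitionId && colName.equals(other.colName);
        }

        @Override
        public int hashCode() {
            return Long.hashCode(partitionId);
        }
    }

    // Hypothetical alternative: the column name also contributes to the hash.
    static final class FullHashKey {
        final long partitionId;
        final String colName;

        FullHashKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof FullHashKey)) return false;
            FullHashKey other = (FullHashKey) o;
            return partitionId == other.partitionId && colName.equals(other.colName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(partitionId, colName);
        }
    }

    // Counts distinct hash codes over partitions x columns keys (partition-only hash).
    static int distinctPartitionOnlyHashes(int partitions, int columns) {
        Set<Integer> hashes = new HashSet<>();
        for (long p = 0; p < partitions; p++) {
            for (int c = 0; c < columns; c++) {
                hashes.add(new PartitionOnlyHashKey(p, "col_" + c).hashCode());
            }
        }
        return hashes.size();
    }

    // Same count for the key whose hash includes the column name.
    static int distinctFullHashes(int partitions, int columns) {
        Set<Integer> hashes = new HashSet<>();
        for (long p = 0; p < partitions; p++) {
            for (int c = 0; c < columns; c++) {
                hashes.add(new FullHashKey(p, "col_" + c).hashCode());
            }
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        // 10 partitions x 1000 columns = 10,000 keys.
        // The partition-only hash yields one hash code per partition, so this prints 10.
        System.out.println("partition-only hash: " + distinctPartitionOnlyHashes(10, 1000));
        // Including the column name spreads the keys over many more hash codes.
        System.out.println("full hash:           " + distinctFullHashes(10, 1000));
    }
}
```

With a partition-only hash, all columns of a partition share one bucket in a `HashMap`, so each lookup or insert has to compare against the other entries for that partition. That matches the slowdown the description attributes to wide tables with many partitions, under the stated assumptions.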