[ 
https://issues.apache.org/jira/browse/HIVE-29165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016926#comment-18016926
 ] 

Zhihua Deng commented on HIVE-29165:
------------------------------------

Thank you [~dkuzmenko] for the review!

> PartColNameInfo could introduce high hash collision due to the wide table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-29165
>                 URL: https://issues.apache.org/jira/browse/HIVE-29165
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> In {{DirectSqlUpdatePart}}, {{PartColNameInfo}} acts as a map key that
> identifies the statistics to be updated or inserted. If the current table
> has lots of columns, say 1000, then the {{PartColNameInfo}} keys are
> expected to hash into the same map bucket more than 1000 times. If the table
> also has thousands of partitions, locating or inserting a {{PartColNameInfo}}
> could be very slow.
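
The effect described above can be reproduced with a small, self-contained sketch. It assumes the collisions come from a hashCode() that ignores the column name; the issue text does not show the real implementation, so NarrowKey, WideKey and distinctHashes below are hypothetical stand-ins and not the actual Hive classes.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo: these key classes only imitate a (partition, column) map key.
final class PartColKeyDemo {

    // Assumed collision-prone shape: the hash depends on the partition id only,
    // so every column of one partition shares a single hash value.
    static final class NarrowKey {
        final long partitionId;
        final String colName;

        NarrowKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NarrowKey)) {
                return false;
            }
            NarrowKey other = (NarrowKey) o;
            return partitionId == other.partitionId && colName.equals(other.colName);
        }

        @Override
        public int hashCode() {
            return Long.hashCode(partitionId);
        }
    }

    // One possible remedy: mix the column name into the hash as well.
    static final class WideKey {
        final long partitionId;
        final String colName;

        WideKey(long partitionId, String colName) {
            this.partitionId = partitionId;
            this.colName = colName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof WideKey)) {
                return false;
            }
            WideKey other = (WideKey) o;
            return partitionId == other.partitionId && colName.equals(other.colName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(partitionId, colName);
        }
    }

    // Counts the distinct hash codes produced for partitions x columns keys.
    static int distinctHashes(boolean includeColumn, int partitions, int columns) {
        Set<Integer> hashes = new HashSet<>();
        for (long p = 0; p < partitions; p++) {
            for (int c = 0; c < columns; c++) {
                String col = "col_" + c;
                Object key = includeColumn ? new WideKey(p, col) : new NarrowKey(p, col);
                hashes.add(key.hashCode());
            }
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        System.out.println("narrow: " + distinctHashes(false, 10, 1000));
        System.out.println("wide:   " + distinctHashes(true, 10, 1000));
    }
}
```

With 10 partitions and 1000 columns, the narrow key yields only 10 distinct hash codes for 10,000 keys, so each lookup has to compare against up to 1000 colliding entries. The wide key spreads the same keys over far more hash values.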



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
