[ 
https://issues.apache.org/jira/browse/HIVE-28523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liux updated HIVE-28523:
------------------------
    Attachment: ME1726238367718.jpg

> Possible performance problem when dropping a table or partitions
> ----------------
>
>                 Key: HIVE-28523
>                 URL: https://issues.apache.org/jira/browse/HIVE-28523
>             Project: Hive
>          Issue Type: Improvement
>      Security Level: Public(Viewable by anyone) 
>          Components: Standalone Metastore
>            Reporter: liux
>            Assignee: liux
>            Priority: Major
>         Attachments: ME1726238367718.jpg
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> 1. The loop that runs when dropping a table or partition objects may have a performance problem.
> The code in question is in standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:
> for (String partName : partNames) {
>   Path partPath = wh.getDnsPath(new Path(pathString));
> }
> Assuming a single wh.getDnsPath call takes about 10 ms, iterating over 200,000 partition objects takes about 33 minutes (200,000 x 10 ms = 2,000 s), which may cause dropping a large table or its partitions to time out.
> 2. There is no need to execute wh.getDnsPath(new Path(pathString)) for every partition name in the loop; it only needs to run when the partition is not a subdirectory of the table directory.
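The idea in point 2 could be sketched roughly as follows. This is an illustration only, not the HMSHandler code: plain strings stand in for org.apache.hadoop.fs.Path, the `resolver` argument stands in for wh.getDnsPath, and the names PartitionPathSketch, CountingResolver, isUnderTable and externalPartitionPaths are hypothetical. It also assumes the table and partition locations are already stored in the same fully qualified form, so a prefix check is enough to tell whether a partition lives under the table directory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch: strings stand in for Hadoop Path objects and
// `resolver` stands in for the costly wh.getDnsPath call.
final class PartitionPathSketch {

    /** Stand-in resolver that counts how often it is invoked. */
    static final class CountingResolver implements UnaryOperator<String> {
        int calls = 0;

        @Override
        public String apply(String path) {
            calls++;
            return path; // a real resolver would qualify the path against the file system
        }
    }

    /** True when location is tableDir itself or lies below it (assumes both are already qualified). */
    static boolean isUnderTable(String tableDir, String location) {
        String base = tableDir.endsWith("/") ? tableDir : tableDir + "/";
        return location.equals(tableDir) || location.startsWith(base);
    }

    /**
     * Returns the partition locations outside the table directory.
     * The resolver runs only for those; partitions under the table
     * directory are skipped without any resolver call.
     */
    static List<String> externalPartitionPaths(String tableDir,
                                               List<String> partLocations,
                                               UnaryOperator<String> resolver) {
        List<String> result = new ArrayList<>();
        for (String location : partLocations) {
            if (isUnderTable(tableDir, location)) {
                continue; // removed together with the table directory, no lookup needed
            }
            result.add(resolver.apply(location));
        }
        return result;
    }
}
```

With a table whose partitions almost all sit under the table directory, the number of resolver calls drops from one per partition to one per external partition, so the 10 ms cost assumed above would apply only to the exceptions.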



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
