liux created HIVE-28523:
---------------------------

             Summary: Possible performance problem when dropping a table or partitions
                 Key: HIVE-28523
                 URL: https://issues.apache.org/jira/browse/HIVE-28523
             Project: Hive
          Issue Type: Improvement
      Security Level: Public (Viewable by anyone)
          Components: Standalone Metastore
            Reporter: liux
            Assignee: liux

1. The loop that runs when dropping a table or partition objects may have a performance problem. It is located in standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:

    for (String partName : partNames) {
      Path partPath = wh.getDnsPath(new Path(pathString));
    }

Assuming a single wh.getDnsPath call takes about 10 ms, iterating over 200,000 partition objects takes about 33 minutes (200,000 x 10 ms = 2,000 s), which can cause dropping a large table or its partitions to time out.

2. There is no need to execute wh.getDnsPath(new Path(pathString)) for every partition name during the traversal. It only needs to run when the partition is not a subdirectory of the table directory.
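
The idea in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual HMSHandler code: locations are handled as plain strings, the class and method names (PartitionPathSketch, isUnderTable, collectExternalPartitionPaths) are hypothetical, and the resolver argument stands in for the expensive wh.getDnsPath(new Path(pathString)) call. It also assumes the table location and the partition locations are stored in the same form, so a prefix comparison is meaningful.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: resolve only partition locations outside the table directory.
final class PartitionPathSketch {

    // True if partLocation lies strictly below tableLocation (plain string prefix check).
    static boolean isUnderTable(String tableLocation, String partLocation) {
        String base = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
        return partLocation.startsWith(base);
    }

    // Calls the (expensive) resolver only for partitions that are not
    // subdirectories of the table directory; the others are skipped
    // because removing the table directory already covers them.
    static List<String> collectExternalPartitionPaths(
            String tableLocation,
            List<String> partLocations,
            Function<String, String> resolver) {
        List<String> resolved = new ArrayList<>();
        for (String location : partLocations) {
            if (location == null || isUnderTable(tableLocation, location)) {
                continue; // no resolver call needed for this partition
            }
            resolved.add(resolver.apply(location));
        }
        return resolved;
    }
}
```

With this shape, a table whose partitions all live under the table directory triggers no resolver calls at all, so the cost of the loop no longer grows with the per-call latency of the path resolution.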