[jira] [Created] (HDFS-15715) ReplicatorMonitor performance degradation, when the storagePolicy of many file are not match with their real datanodestorage

zhengchenyu (Jira) Mon, 07 Dec 2020 05:58:16 -0800

zhengchenyu created HDFS-15715:
----------------------------------

             Summary: ReplicatorMonitor performance degradation, when the 
storagePolicy of many file are not match with their real datanodestorage 
                 Key: HDFS-15715
                 URL: https://issues.apache.org/jira/browse/HDFS-15715
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs
    Affects Versions: 3.2.1, 2.7.3
            Reporter: zhengchenyu
            Assignee: zhengchenyu
             Fix For: 3.3.1



One of our NameNodes holds 300M files and blocks. Normally this NameNode 
should not be under heavy load, but we found that RPC processing time stays 
high and decommissioning is very slow.
 
I checked the metrics and found that the number of under-replicated blocks 
stays high. Then I ran jstack on the NameNode and found that 
'InnerNode.getLoc' is a hot spot. I think chooseTarget may be failing to find 
a target, which results in the performance degradation. Considering 
HDFS-10453, I guess some logic triggers a scenario where chooseTarget cannot 
find a proper target.

Then I enabled some debug logging. (I revised the code so that only 
isGoodTarget is logged at debug level, because enabling BlockPlacementPolicy's 
debug log is dangerous.) I found that "the rack has too many chosen nodes" was 
being logged. I also found log entries like these:

{code}
2020-12-04 12:13:56,345 WARN 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
place enough replicas, still in need of 0 to reach 3 (unavailableStorages=[], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more 
information, please enable DEBUG log level on 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
2020-12-04 12:14:03,843 WARN 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to 
place enough replicas, still in need of 0 to reach 3 (unavailableStorages=[], 
storagePolicy=BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE], 
creationFallbacks=[], replicationFallbacks=[]}, newBlock=false) For more 
information, please enable DEBUG log level on 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
{code} 

After some debugging and simulation, I found the cause and reproduced the 
problem.

The cause is that some developers use the COLD storage policy together with 
the mover, but setting the storage policy and running the mover are 
asynchronous operations. As a result, the actual storage types of some files' 
replicas do not match their storagePolicy.

Let me simulate the process. Suppose /tmp/a is created and has 2 replicas on 
DISK. Then its storage policy is set to COLD. When some logic (for example, 
decommission) triggers a copy of this block, chooseTarget uses 
chooseStorageTypes to work out which storage types are still needed. Here the 
size of the variable requiredStorageTypes, which chooseStorageTypes returns, 
is 3, but the size of result is 2: the 3 means that 3 ARCHIVE storages are 
needed, and the 2 means that the block already has 2 DISK storages. 
chooseTarget then tries to choose 3 targets. Choosing the first target works, 
but when choosing the second target, the variable 'counter' in isGoodTarget is 
4, which is larger than maxTargetPerRack, which is 3. So every datanode 
storage is skipped, which results in bad performance.
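
The arithmetic in the walkthrough above can be sketched in a few lines of 
plain Java. This is a simplified model, not the Hadoop code: the class 
StorageMismatchSketch and its methods are hypothetical, the policy is reduced 
to a single storage type, and the rack check assumes (purely for illustration) 
that the two existing replicas and the first new target all sit on the 
candidate's rack.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical model of the scenario; not the HDFS implementation.
public class StorageMismatchSketch {
    enum StorageType { DISK, ARCHIVE }

    // Storage types a single-type policy expects for the given replication.
    static List<StorageType> expectedTypes(StorageType policyType, int replication) {
        return new ArrayList<>(Collections.nCopies(replication, policyType));
    }

    // Expected types minus those already covered, one-for-one, by existing replicas.
    static List<StorageType> requiredTypes(StorageType policyType, int replication,
                                           List<StorageType> existing) {
        List<StorageType> required = expectedTypes(policyType, replication);
        for (StorageType t : existing) {
            required.remove(t); // removes one matching entry; no-op if none matches
        }
        return required;
    }

    // Per-rack limit: the candidate counts as 1, plus every already-chosen node on its rack.
    static boolean rackHasRoom(String candidateRack, List<String> chosenRacks,
                               int maxTargetPerRack) {
        int counter = 1;
        for (String rack : chosenRacks) {
            if (rack.equals(candidateRack)) {
                counter++;
            }
        }
        return counter <= maxTargetPerRack;
    }

    public static void main(String[] args) {
        // /tmp/a: 2 replicas on DISK, policy switched to COLD (ARCHIVE), replication 3.
        List<StorageType> existing = Arrays.asList(StorageType.DISK, StorageType.DISK);
        List<StorageType> required = requiredTypes(StorageType.ARCHIVE, 3, existing);
        System.out.println("required=" + required.size() + " existing=" + existing.size());

        // Assumption: 2 existing replicas + 1 newly chosen target share rack "/r1".
        List<String> chosenRacks = Arrays.asList("/r1", "/r1", "/r1");
        System.out.println("rackHasRoom=" + rackHasRoom("/r1", chosenRacks, 3));
    }
}
```

Because none of the DISK replicas cancels an ARCHIVE requirement, all 3 
ARCHIVE targets are still requested while the 2 DISK replicas keep counting 
toward the per-rack limit.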

I think chooseStorageTypes needs to take result into account: when an 
existing replica does not meet the storage policy's requirements, we need to 
remove it from result. 

I changed the code this way and tested it in my unit test, which solved the 
problem.
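
The patch itself is not included in this message. As a rough illustration of 
the idea only, the sketch below drops existing replicas whose storage type the 
policy does not accept before they are counted. The class 
ConformingReplicasSketch and the method conforming are hypothetical names, and 
the real change may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed filtering; not the actual HDFS patch.
public class ConformingReplicasSketch {
    enum StorageType { DISK, ARCHIVE }

    // Keep only the existing replicas whose storage type the policy accepts.
    static List<StorageType> conforming(List<StorageType> existing,
                                        List<StorageType> policyTypes) {
        List<StorageType> kept = new ArrayList<>();
        for (StorageType t : existing) {
            if (policyTypes.contains(t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // COLD policy accepts ARCHIVE only; /tmp/a still has 2 DISK replicas.
        List<StorageType> coldTypes = Arrays.asList(StorageType.ARCHIVE);
        List<StorageType> existing = Arrays.asList(StorageType.DISK, StorageType.DISK);

        List<StorageType> kept = conforming(existing, coldTypes);
        // With the DISK replicas dropped, the 3 requested ARCHIVE targets are
        // counted against 0 conforming replicas instead of 2 mismatched ones.
        System.out.println("conforming replicas=" + kept.size());
    }
}
```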





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

