Cached directory size in INodeDirectory can get permanently out of sync with computed size, causing quota issues
----------------------------------------------------------------------------------------------------------------
Key: HDFS-3061
URL: https://issues.apache.org/jira/browse/HDFS-3061
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.20.203.0
Reporter: Alex Holmes

It appears that there is a condition under which the cached size of an HDFS directory with a quota set can permanently differ from the computed value. When this happens, the following command:

{code}
hadoop fs -count -q /tmp/quota-test
{code}

results in the following output in the NameNode logs:

{code}
WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Inconsistent diskspace for directory quota-test. Cached: 6000 Computed: 6072
{code}

I've observed both transient and persistent instances of this. In the transient instances the warning goes away, but in the persistent instances every invocation of the {{fs -count -q}} command yields the above warning.

I've seen instances where the actual disk usage of a directory is 25% of the cached value in INodeDirectory. This creates problems because the quota code uses the cached value to determine whether block write requests are permitted.

This isn't easy to reproduce. I am able to (inconsistently) get HDFS into this state with a simple program which:

# Writes files into HDFS
# When a DSQuotaExceededException is encountered, removes all files created in step 1
# Repeats step 1

I'm going to try to come up with a more repeatable test case to reproduce this issue.
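To make it easier to tell transient from persistent occurrences, the warning quoted above can be extracted from NameNode logs and the difference between the two values tracked over time. The following is only a minimal sketch: the class name {{QuotaWarningParser}}, its nested {{Mismatch}} type, and the regular expression are hypothetical and modeled solely on the single log line quoted in this report, so other Hadoop versions may word the message differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Hadoop: parses the "Inconsistent diskspace" warning. */
public class QuotaWarningParser {

    // Assumption: the pattern mirrors the one warning line quoted in this report.
    private static final Pattern WARNING = Pattern.compile(
            "Inconsistent diskspace for directory (\\S+)\\. Cached: (\\d+) Computed: (\\d+)");

    /** Values carried by one warning line. */
    public static final class Mismatch {
        public final String directory;
        public final long cached;
        public final long computed;

        Mismatch(String directory, long cached, long computed) {
            this.directory = directory;
            this.cached = cached;
            this.computed = computed;
        }

        /** Positive when the computed size exceeds the cached size. */
        public long drift() {
            return computed - cached;
        }
    }

    /** Returns the parsed mismatch, or null if the line is not this warning. */
    public static Mismatch parse(String logLine) {
        Matcher m = WARNING.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new Mismatch(m.group(1),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)));
    }

    public static void main(String[] args) {
        String line = "WARN org.apache.hadoop.hdfs.server.namenode.NameNode: "
                + "Inconsistent diskspace for directory quota-test. "
                + "Cached: 6000 Computed: 6072";
        Mismatch m = parse(line);
        System.out.println(m.directory + " drift=" + m.drift());
    }
}
```

Running {{parse}} over successive log lines for the same directory would show whether the drift returns to zero (transient) or keeps reappearing on every {{fs -count -q}} invocation (persistent).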