Nathan Roberts created HDFS-11661:
-------------------------------------

             Summary: GetContentSummary uses excessive amounts of memory
                 Key: HDFS-11661
                 URL: https://issues.apache.org/jira/browse/HDFS-11661
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 2.8.0
            Reporter: Nathan Roberts
            Priority: Blocker


ContentSummaryComputationContext::nodeIncluded() is being used to keep track of 
all INodes visited during the current content summary calculation. This can be 
all of the INodes in the filesystem, making for a VERY large hash table. This 
simply won't work on large filesystems. 

We noticed this after an upgrade, when a namenode with ~100 million filesystem 
objects began spending significantly more time in GC. Fortunately this system 
had some memory breathing room; other clusters we have will not run with this 
additional demand on memory.

This was added as part of HDFS-10797 as a way of keeping track of INodes that 
have already been accounted for - to avoid double counting.
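
As a rough illustration of the pattern described above, here is a minimal 
sketch of a traversal that records every visited node in a hash set to avoid 
double counting. It is not the actual HDFS code: Node, SummaryContext, 
summarize() and buildDemoTree() are hypothetical stand-ins, and only the name 
nodeIncluded() is borrowed from the real class. The point it shows is that the 
set ends up holding one entry per node visited, so its size tracks the size of 
the subtree being summarized.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-ins; these are NOT the real HDFS classes.
public class VisitedSetSketch {

    /** Minimal tree node standing in for an INode (illustrative only). */
    static final class Node {
        final long id;
        final List<Node> children = new ArrayList<>();

        Node(long id) {
            this.id = id;
        }
    }

    /** Remembers every node counted so far so that none is counted twice. */
    static final class SummaryContext {
        private final Set<Node> included = new HashSet<>();
        private long count = 0;

        /** Returns true if the node was already counted; otherwise records it. */
        boolean nodeIncluded(Node n) {
            return !included.add(n);
        }

        long count() {
            return count;
        }

        /** Number of entries retained in the set for the whole computation. */
        int trackedNodes() {
            return included.size();
        }
    }

    /** Counts each reachable node once, skipping nodes that were already seen. */
    static void summarize(Node n, SummaryContext ctx) {
        if (ctx.nodeIncluded(n)) {
            return; // reached again via another path: do not double count
        }
        ctx.count++;
        for (Node child : n.children) {
            summarize(child, ctx);
        }
    }

    /** Builds a root with the given number of children, one of them linked twice. */
    static Node buildDemoTree(int children) {
        Node root = new Node(0);
        for (int i = 1; i <= children; i++) {
            root.children.add(new Node(i));
        }
        // Assumption for the demo: the same node is reachable through two paths.
        root.children.add(root.children.get(0));
        return root;
    }

    public static void main(String[] args) {
        SummaryContext ctx = new SummaryContext();
        summarize(buildDemoTree(1000), ctx);
        System.out.println("counted=" + ctx.count()
            + " tracked=" + ctx.trackedNodes());
    }
}
```

In this sketch the duplicate link is skipped as intended, but the set still 
keeps an entry for every node in the tree until the computation finishes, 
which is the memory cost this issue is about.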



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

