The heap histogram does show that it is caused by snapshots:

   2:     360695137    25970049864  org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy
   3:     360695137    20198927672  org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff
   4:     363457663    11630645216  org.apache.hadoop.hdfs.server.namenode.XAttrFeature
 ...
   7:       2954610      283642560  org.apache.hadoop.hdfs.server.namenode.INodeFile
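As a rough cross-check of the histogram above, the arithmetic can be sketched in Python. The instance and byte counts are copied from the histogram; the dictionary layout and the function names are only illustrative, and the byte figures are NameNode heap bytes, not on-disk FSImage bytes.

```python
# Figures copied from the heap histogram: class name -> (instances, bytes).
# Class names are shortened for readability; the layout is an assumption
# made for this sketch, not something produced by a Hadoop tool.
HISTOGRAM = {
    "INodeFileAttributes$SnapshotCopy": (360695137, 25970049864),
    "FileDiff": (360695137, 20198927672),
    "XAttrFeature": (363457663, 11630645216),
    "INodeFile": (2954610, 283642560),
}


def bytes_per_instance(name):
    """Average heap bytes used by one object of the given class."""
    instances, total_bytes = HISTOGRAM[name]
    return total_bytes / instances


def diffs_per_file():
    """Number of FileDiff objects per INodeFile object."""
    return HISTOGRAM["FileDiff"][0] / HISTOGRAM["INodeFile"][0]


def snapshot_heap_bytes():
    """Heap bytes held by SnapshotCopy and FileDiff objects together."""
    return (
        HISTOGRAM["INodeFileAttributes$SnapshotCopy"][1]
        + HISTOGRAM["FileDiff"][1]
    )


if __name__ == "__main__":
    for name in HISTOGRAM:
        print(f"{name}: {bytes_per_instance(name):.0f} bytes per instance")
    print(f"FileDiff per INodeFile: {diffs_per_file():.1f}")
    print(f"SnapshotCopy + FileDiff heap: {snapshot_heap_bytes() / 2**30:.1f} GiB")
```

With these figures there are roughly 122 FileDiff objects per INodeFile, and the SnapshotCopy and FileDiff objects together account for about 43 GiB of heap. The XAttrFeature count is nearly the same as the FileDiff count; whether those objects also belong to the snapshot copies is not something the histogram alone confirms.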
Here is the snapshot info of HDFS:

"SnapshottableDirectories" : 86,
"Snapshots" : 39864,

Why is the number of FileDiff objects more than 100 times the number of INodeFile objects?

Thanks,
Jason

On 12/1/20, 7:23 PM, "Jason Wen" <zhenshan....@workday.com.INVALID> wrote:

    I also suspect it might be caused by snapshots. We don't have lots of xattrs or ACLs on our files.
    How can I confirm whether it is really caused by snapshots?

    Thanks,
    Jason

    On 12/1/20, 4:43 PM, "Wei-Chiu Chuang" <weic...@cloudera.com.INVALID> wrote:

        Possibly snapshots. Also possible if each of your files has lots of xattrs or ACLs.

        On Tue, Dec 1, 2020 at 3:52 PM Jason Wen <zhenshan....@workday.com.invalid> wrote:

        > Hi,
        >
        > We are encountering an odd FSImage size issue in one of our Hadoop
        > clusters. The Namenode only has about 3M files/blocks, but the FSImage size
        > is about 55GB.
        > We have never seen this kind of gap between the number of files/blocks and
        > the FSImage size. As a comparison, we have another similar cluster which also
        > has ~3M files/blocks, but the FSImage size is only ~1GB. We also have
        > another cluster that has ~200M files/blocks but the FSImage size is only
        > ~45GB.
        >
        > My understanding is that the FSImage size or the heap memory usage of the Namenode
        > is mostly determined by the number of files/blocks. The gap that we
        > observed seems to be caused by other factors in the Namenode FSImage/Namespace.
        >
        > Can anyone shed some light on what could cause this FSImage issue?
        >
        > Thanks,
        > Jason