Hi, we are seeing an odd FSImage size issue in one of our Hadoop clusters. The Namenode has only about 3M files/blocks, but the FSImage is about 55GB. We have never seen this large a gap between the number of files/blocks and the FSImage size. For comparison, another similar cluster of ours also has ~3M files/blocks, but its FSImage is only ~1GB, and a third cluster has ~200M files/blocks with an FSImage of only ~45GB.
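To put the gap in numbers, here is a rough back-of-the-envelope sketch in Python that divides each FSImage size by the file/block count quoted above. The cluster labels are just names I made up for this post, the counts and sizes are the approximate figures above, and I am assuming 1GB = 10^9 bytes, so treat the results as ballpark values only.

```python
# Rough bytes-per-object comparison for the three clusters described above.
# Assumption: sizes are decimal gigabytes (1 GB = 10**9 bytes).
GB = 10**9

# (approximate FSImage size in bytes, approximate number of files/blocks)
# The dictionary keys are illustrative labels, not real cluster names.
clusters = {
    "problem cluster": (55 * GB, 3_000_000),
    "similar cluster": (1 * GB, 3_000_000),
    "large cluster": (45 * GB, 200_000_000),
}


def bytes_per_object(fsimage_bytes, num_objects):
    """Average FSImage bytes attributed to each file/block."""
    return fsimage_bytes / num_objects


for name, (size, count) in clusters.items():
    print(f"{name}: {bytes_per_object(size, count):,.0f} bytes per file/block")
```

On these figures the problem cluster comes out at roughly 18 KB per file/block, versus a few hundred bytes on the other two, which is why the image looks so far out of line to us.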
My understanding is that the FSImage size and the Namenode heap usage are mostly determined by the number of files/blocks. The gap we observed seems to be caused by something else in the Namenode FSImage/namespace. Can anyone shed some light on what could cause this? Thanks, Jason