I also suspect snapshots might be the cause. We don't have many xattrs or
ACLs on our files.
How can I confirm whether snapshots are really the cause?
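
For a rough sense of scale, here is a small sketch that turns the approximate figures from my original message (quoted below) into bytes of FSImage per file/block. It assumes 1 GB = 10**9 bytes, and the cluster labels are made up for illustration; it only shows how far off the ratio is and says nothing about the cause.

```python
# Rough bytes-per-object comparison using the approximate figures from this thread.
# Assumption: 1 GB = 10**9 bytes; the cluster labels are hypothetical.
GB = 10**9

# label -> (approximate FSImage size in bytes, approximate files/blocks count)
clusters = {
    "problem cluster": (55 * GB, 3_000_000),
    "similar cluster": (1 * GB, 3_000_000),
    "large cluster": (45 * GB, 200_000_000),
}


def bytes_per_object(fsimage_bytes, objects):
    """Average FSImage bytes per file/block."""
    return fsimage_bytes / objects


for name, (size, count) in clusters.items():
    print(f"{name}: ~{bytes_per_object(size, count):,.0f} bytes per file/block")
```

The problem cluster comes out at roughly 18 KB per file/block, against a few hundred bytes on the other two, so something other than the file/block count seems to be taking up most of the image.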

Thanks,
Jason

On 12/1/20, 4:43 PM, "Wei-Chiu Chuang" <weic...@cloudera.com.INVALID> wrote:

    Possibly snapshots.
    Also possible if each of your files has lots of xattr or ACLs.

    On Tue, Dec 1, 2020 at 3:52 PM Jason Wen <zhenshan....@workday.com.invalid>
    wrote:

    > Hi,
    >
    > We are encountering an odd FSImage size issue in one of our Hadoop
    > clusters. The Namenode only has about 3M files/blocks, but the FSImage
    > size is about 55GB.
    > We have never seen this kind of gap between number of files/blocks vs
    > FSImage size. As a comparison, we have another similar cluster which also
    > has ~3M files/blocks, but the FSImage size is only ~1GB. We also have
    > another cluster that has ~200M files/blocks but the FSImage size is only
    > ~45GB.
    >
    > My understanding is the FSImage size or the heap memory usage of Namenode
    > is mostly determined by the number of files/blocks. The gap that we
    > observed seems to be caused by other factors in the Namenode
    > FSImage/Namespace.
    >
    > Can anyone shed some light on what could cause this FSImage issue?
    >
    > Thanks,
    > Jason
    >
