Reposting here to see if any of the HDFS developers have some good insight
into this.
The deep dive is in the original message below. The gist is that after
upgrading to 2.7.2 on a ~260 node cluster, the active NN's fsimage download
and edit log rolls seem to get stuck in native FileChannel.force calls.
> like that. In one of our large clusters (5000+ nodes, 2.7.3ish, jdk8),
> rollEdits() takes less than 30ms consistently.
>
> Kihwal
>
>
> --
> *From:* Joey Paskhay
> *To:* hdfs-dev@hadoop.apache.org
> *Sent:* Tuesday, September 13, 2016 12:06