Hi folks,

We swapped the HDFS namenode, keeping the same hostname but using a different IP
address, so that clients would not need to change their configuration. However,
after the swap some applications are seeing "Unable to close file because the last
block does not have enough number of replicas." and the server side sees
"org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
BLOCK* * is COMMITTED but not COMPLETE(numNodes= 0 < minimum = 1) in file
*".

We were thinking the datanode might need to retry one more time to figure out
the right IP address of the active namenode, but the default client-side retry
settings

dfs.client.block.write.locateFollowingBlock.retries: 5
dfs.client.block.write.locateFollowingBlock.initial.delay.ms: 400

seem to be sufficient.
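
For reference, this is roughly how a client could override those two settings
through the standard Hadoop Configuration API. This is only my own sketch to show
which knobs I mean; the bumped values are illustrative, not something we have
verified as a fix:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class WriteRetryConfigExample {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            // Defaults are 5 retries with a 400 ms initial delay; the values
            // below are only an example, not a recommendation.
            conf.setInt("dfs.client.block.write.locateFollowingBlock.retries", 10);
            conf.setLong("dfs.client.block.write.locateFollowingBlock.initial.delay.ms", 1000L);
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Using filesystem: " + fs.getUri());
        }
    }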

I want to check whether anyone has seen this issue before, and what the possible
cause might be.

Thanks,
Aihua
