Are you using CMS? How big is the young gen? And how often does the NN run a young gen collection when it is slow?
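If it helps to gather those numbers, here is a minimal sketch that reads the collector names, collection counts and heap pool sizes through the standard java.lang.management MXBeans. The class name GcSnapshot and the output format are just illustrative. As written it reports on the JVM it runs in, so for the NN you would have to read the same beans over JMX from the NameNode process (assuming JMX is enabled there) or simply use jstat -gcutil against the NN pid.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: dumps GC and heap pool stats
// for the JVM it is running in.
public class GcSnapshot {

    private static final long MB = 1024L * 1024L;

    // One line per collector: name, cumulative collection count and time.
    // With CMS the names are typically "ParNew" and "ConcurrentMarkSweep".
    static List<String> collectorLines() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("collector=%s count=%d timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    // One line per heap pool (eden, survivor, old gen): used, committed, max.
    static List<String> heapPoolLines() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, etc.
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined.
            long maxMb = usage.getMax() < 0 ? -1 : usage.getMax() / MB;
            lines.add(String.format("pool=%s usedMB=%d committedMB=%d maxMB=%d",
                    pool.getName(), usage.getUsed() / MB,
                    usage.getCommitted() / MB, maxMb));
        }
        return lines;
    }

    public static void main(String[] args) {
        collectorLines().forEach(System.out::println);
        heapPoolLines().forEach(System.out::println);
    }
}
```

Taking two snapshots a known interval apart and subtracting the counts gives the young gen collection rate. Comparing a snapshot from a fast day with one from a slow day should show whether the old gen pool is what keeps growing.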

On Tue, Sep 25, 2018 at 4:04 AM Lin,Yiqun(vip.com) <yiqun01....@vipshop.com> wrote:

> Hi HDFS developers,
>
> We hit a serious problem after a rolling upgrade of our Hadoop version from
> 2.5.0-cdh5.3.2 to 2.6.0-cdh5.13.1: the NN becomes slow periodically (about
> once a week). For example, if we start the NN on Monday, it runs fast, but
> by the weekend the cluster has become very slow.
>
> At first we suspected FSN lock contention and made some improvements for
> it, e.g. making the remove-block interval configurable and printing the
> FSN lock elapsed time. The problem still exists after these changes, so we
> suspect this may not be an HDFS RPC problem.
>
> Finally we found a related phenomenon: every time the NN runs slow, its old
> gen reaches a high value, around 100GB, while the NN's total metadata size
> is only around 40GB in our cluster. As a temporary workaround, we reduced
> the heap size and now trigger full GC frequently. It looks better than
> before, but we haven't found the root cause. We are not sure whether this
> is a JVM tuning problem or an HDFS bug.
>
> Has anyone met a similar problem in this version? Why would the NN old gen
> usage increase so much?
>
> Some information about our environment:
> JDK 1.8
> 500+ nodes, 150 million blocks, around 40GB of metadata.
>
> We would appreciate any comments.
>
> Thanks,
> Yiqun