Try deleting /system/balancer.id, then search the NameNode logs for ERROR or 
WARN entries.
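
A minimal sketch of those two steps, assuming no balancer is running anywhere on the cluster (the file marks a running balancer, so removing it under a live one is unsafe). The two function names are made up for illustration and are not part of Hadoop; only `hdfs dfs -test`, `hdfs dfs -rm` and `grep` are real commands. The log path is site-specific and not shown here.

```shell
# Hypothetical helper: remove the balancer marker file if it exists.
# Assumes no balancer process is currently running against the cluster.
remove_balancer_id() {
  if hdfs dfs -test -e /system/balancer.id; then
    hdfs dfs -rm /system/balancer.id
  else
    echo "/system/balancer.id not present"
  fi
}

# Hypothetical helper: print the WARN and ERROR lines of a log file.
# Matches the level field as it appears in lines such as
# "2025-03-09 15:05:04,542 INFO balancer.Balancer: ...".
scan_log_for_problems() {
  grep -E '(^| )(WARN|ERROR) ' "$1"
}
```

Possible usage, with the NameNode log path filled in for the local installation:

    remove_balancer_id
    scan_log_for_problems /path/to/namenode.log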


---- Replied Message ----
| From | Sébastien Rebecchi<srebec...@kameleoon.com.INVALID> |
| Date | 3/9/2025 23:08 |
| To | Zhanghaobo<hfutzhan...@163.com> |
| Cc | hadoop-user-maillist<u...@hadoop.apache.org>,
hdfs-dev<hdfs-dev@hadoop.apache.org> |
| Subject | Re: Can not run HDFS balancer cause metrics already exists |
I get the same error (metrics already exists) after adding -asService to the 
command line; the only difference is that it retries every 5 minutes:


2025-03-09 15:05:04,542 INFO balancer.Balancer: Finished one round, will wait for 5.0 minutes for next round


That does not seem like a good workaround. My cluster has hundreds of TB to 
rebalance whenever I add a data node, and I don't remember having such issues 
when I was using Hadoop 2.9.1.
Is there a known issue with the balancer in recent Hadoop versions?


Thanks,
Sébastien


On Sun, Mar 9, 2025 at 16:02, Sébastien Rebecchi <srebec...@kameleoon.com> 
wrote:

OK, I can try that, hoping it will help.
By the way, even if it works, it does not explain this metrics exception.
Any idea how to solve this? I can't find a way to delete that metrics source 
in any Hadoop documentation.


Thanks


Sébastien.


On Sun, Mar 9, 2025 at 15:39, Zhanghaobo <hfutzhan...@163.com> wrote:

Got it. You could run it as a service and see what happens.


---- Replied Message ----
| From | Sébastien Rebecchi<srebec...@kameleoon.com> |
| Date | 03/09/2025 22:22 |
| To | Zhanghaobo<hfutzhan...@163.com> |
| Cc | u...@hadoop.apache.org, hdfs-dev@hadoop.apache.org |
| Subject | Re: Can not run HDFS balancer cause metrics already exists |
Hi Zhanghaobo,


Thanks for the message.


No, I don't use -asService. As I said, the command line is the following:

hdfs balancer -Ddfs.balancer.movedWinWidth=5400000 
-Ddfs.balancer.moverThreads=1000 -Ddfs.balancer.dispatcherThreads=200 
-Ddfs.datanode.balance.max.concurrent.moves=50 
-Ddfs.datanode.balance.bandwidthPerSec=100m 
-Ddfs.balancer.max-size-to-move=10737418240 -threshold 1


Also, no other balancer is running concurrently on any other node.


Sébastien


On Sun, Mar 9, 2025 at 13:57, Zhanghaobo <hfutzhan...@163.com> wrote:


Hi @Sébastien Rebecchi,

I don't know the details of how you start the balancer. Did you use 
-asService?



---- Replied Message ----
| From | Sébastien Rebecchi<srebec...@kameleoon.com.INVALID> |
| Date | 3/9/2025 18:03 |
| To | <u...@hadoop.apache.org>,
<hdfs-dev@hadoop.apache.org> |
| Subject | Re: Can not run HDFS balancer cause metrics already exists |
Hello


Could anyone help with this, please?
The situation is still the same after several days.
Some additional details:
- hadoop version 3.4.1
- balancer command line run: hdfs balancer -Ddfs.balancer.movedWinWidth=5400000 
-Ddfs.balancer.moverThreads=1000 -Ddfs.balancer.dispatcherThreads=200 
-Ddfs.datanode.balance.max.concurrent.moves=50 
-Ddfs.datanode.balance.bandwidthPerSec=100m 
-Ddfs.balancer.max-size-to-move=10737418240 -threshold 1
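
For readers decoding the raw numbers in that command, here is a small sketch that converts the two large values into readable units and wraps the same invocation in a function. The units (milliseconds for movedWinWidth, bytes for max-size-to-move) are my reading of hdfs-default.xml, and the variable and function names are made up for illustration.

```shell
# Values copied from the command line above.
moved_win_width_ms=5400000
max_size_to_move_bytes=10737418240

# Convert to human-readable units (assumed units: ms and bytes).
moved_win_minutes=$(( moved_win_width_ms / 1000 / 60 ))
max_size_gib=$(( max_size_to_move_bytes / 1024 / 1024 / 1024 ))

echo "movedWinWidth    = ${moved_win_minutes} minutes"
echo "max-size-to-move = ${max_size_gib} GiB"

# Hypothetical wrapper: same flags as in the message, one per line.
# Defined only; call run_balancer explicitly to start the balancer.
run_balancer() {
  hdfs balancer \
    -Ddfs.balancer.movedWinWidth="$moved_win_width_ms" \
    -Ddfs.balancer.moverThreads=1000 \
    -Ddfs.balancer.dispatcherThreads=200 \
    -Ddfs.datanode.balance.max.concurrent.moves=50 \
    -Ddfs.datanode.balance.bandwidthPerSec=100m \
    -Ddfs.balancer.max-size-to-move="$max_size_to_move_bytes" \
    -threshold 1
}
```

Run as written, the script only prints the two conversions (90 minutes and 10 GiB); nothing is sent to the cluster unless run_balancer is called.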


Thank you



On Tue, Mar 4, 2025, 16:59, Sébastien Rebecchi <srebec...@kameleoon.com> 
wrote:

Hello


After adding a new node to my HDFS cluster, I tried running the balancer, but 
it always fails with the following error, even after retrying multiple times 
during the day and even after restarting the NameNode.
What should I do to unblock it?


Thanks,


Sébastien




ERROR balancer.Balancer: Exiting balancer due an exception
org.apache.hadoop.metrics2.MetricsException: Metrics source Balancer-{HERE REPLACE BY CLUSTER'S BLOCK POOL ID} already exists!
        at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
        at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
        at org.apache.hadoop.hdfs.server.balancer.BalancerMetrics.create(BalancerMetrics.java:52)
        at org.apache.hadoop.hdfs.server.balancer.Balancer.<init>(Balancer.java:362)
        at org.apache.hadoop.hdfs.server.balancer.Balancer.doBalance(Balancer.java:824)
        at org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:868)
        at org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:975)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
        at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:1133)
