Hi all,

I'm seeing the problem below and don't know how to troubleshoot it further, but I can provide any information required. We are running CDH 4.3.0.0 (i.e. HBase 0.94.6) on RHEL 6.2.
HBase shows a steadily increasing number of IPC handler threads in the BLOCKED state. There are hundreds of them, with more appearing over a period of hours. Performance degrades until a regionserver restart is needed to restore it.

A typical blocked thread from the dump:

    Thread 421 (IPC Server handler 368 on 60201):
      State: BLOCKED
      Blocked count: 19314
      Waited count: 322565
      Blocked on org.apache.hadoop.metrics.util.MetricsIntValue@1ec5ca55
      Blocked by 236 (IPC Server handler 183 on 60201)
      Stack:
        org.apache.hadoop.metrics.util.MetricsIntValue.set(MetricsIntValue.java:73)
        org.apache.hadoop.hbase.ipc.HBaseServer.updateCallQueueLenMetrics(HBaseServer.java:1360)
        org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1399)

I don't know how to troubleshoot this much further and am happy to take suggestions.

The attached graph shows the number of blocked threads over time. The sharp drops are when we do rolling restarts of all the regionservers. After each restart, the number of blocked threads slowly starts to grow again.

We determine the number of blocked threads by fetching the thread dump from the regionserver web interface, i.e. <host>:60030/dump, and counting the threads reported as 'State: BLOCKED'.

Analysis of the blocked threads shows that they are blocked on updating metrics: as in the example above, they are waiting on a MetricsIntValue object inside HBaseServer.updateCallQueueLenMetrics.
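
For anyone who wants to reproduce the count, the procedure described above can be scripted roughly as follows. This is only a minimal sketch, assuming the /dump page lists each thread as a `Thread N (name):` header line followed by `State:`, `Blocked on` and `Blocked by` lines. The function names `fetch_dump` and `summarize_blocked` are made up for illustration, and the plain-HTTP URL and default port 60030 are assumptions based on the post.

```python
import re
import urllib.request
from collections import Counter

# Header line that starts each thread entry, e.g.
# "Thread 421 (IPC Server handler 368 on 60201):"
THREAD_HEADER = re.compile(r"^Thread \d+ \(.*\):$")


def fetch_dump(host, port=60030):
    """Fetch the regionserver thread dump (hypothetical helper; assumes plain HTTP)."""
    with urllib.request.urlopen(f"http://{host}:{port}/dump") as resp:
        return resp.read().decode("utf-8", errors="replace")


def summarize_blocked(dump_text):
    """Return (blocked_count, lock_targets, lock_holders) for a thread dump.

    lock_targets counts the objects that BLOCKED threads are waiting on;
    lock_holders counts the threads named in their "Blocked by" lines.
    """
    blocked = 0
    lock_targets = Counter()
    lock_holders = Counter()
    state = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        if THREAD_HEADER.match(line):
            state = None  # a new thread entry begins
        elif line.startswith("State:"):
            state = line.split(":", 1)[1].strip()
            if state == "BLOCKED":
                blocked += 1
        elif state == "BLOCKED" and line.startswith("Blocked on "):
            lock_targets[line[len("Blocked on "):]] += 1
        elif state == "BLOCKED" and line.startswith("Blocked by "):
            lock_holders[line[len("Blocked by "):]] += 1
    return blocked, lock_targets, lock_holders
```

Calling `summarize_blocked(fetch_dump("<host>"))` periodically and logging the first value should give the same numbers as the graph. The two counters show whether all blocked handlers are queued on the same object and which thread is reported as holding it.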