I think I found the thread that is stuck. Is restarting the server harmless in this state?
"RS_CLOSE_REGION-hdfs-ix03.se-ix.delta.prod,60020,1424687995350-1" prio=10 tid=0x00007f75a0008000 nid=0x23ee in Object.wait() [0x00007f757d30b000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.hadoop.hdfs.DFSOutputStream.waitAndQueueCurrentPacket(DFSOutputStream.java:1411)
        - locked <0x00000007544573e8> (a java.util.LinkedList)
        at org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1479)
        - locked <0x0000000756780218> (a org.apache.hadoop.hdfs.DFSOutputStream)
        at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:173)
        at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:116)
        at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:102)
        - locked <0x0000000756780218> (a org.apache.hadoop.hdfs.DFSOutputStream)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        - locked <0x00000007543ef268> (a org.apache.hadoop.hdfs.client.HdfsDataOutputStream)
        at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:1061)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:1047)
        at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:952)
        at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:935)
        at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:844)
        at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:403)
        at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1272)
        at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:835)
        - locked <0x000000075d8b2110> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:746)
        at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:2348)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1580)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1479)
        at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:992)
        at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:956)
        - locked <0x000000075d97b628> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:119)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

On Sat, Mar 14, 2015 at 9:43 PM, Ted Yu <[email protected]> wrote:
> bq. flush the region manually using shell?
>
> I doubt that would work - you can give it a try.
> Please take a jstack of the region server in case you need to restart the server.
>
> BTW HBASE-10499 didn't go into 0.94 (maybe it should have). Please consider upgrading.
>
> Cheers
>
> On Sat, Mar 14, 2015 at 1:30 PM, Kristoffer Sjögren <[email protected]> wrote:
> > Hi Ted
> >
> > Sorry I forgot to mention, hbase-0.94.6 cdh 4.4.
> >
> > Yeah, it was a pretty write-intensive scenario that I think triggered it
> > (importing a lot of datapoints into opentsdb).
> >
> > Do I flush the region manually using shell?
> >
> > Cheers,
> > -Kristoffer
> >
> > On Sat, Mar 14, 2015 at 9:22 PM, Ted Yu <[email protected]> wrote:
> > > Which release of HBase are you using ?
> > >
> > > I wonder if your cluster was hit with HBASE-10499.
> > >
> > > Cheers
> > >
> > > On Sat, Mar 14, 2015 at 1:13 PM, Kristoffer Sjögren <[email protected]> wrote:
> > > > Hi
> > > >
> > > > It seems one of our region servers has been stuck closing a region for
> > > > almost 22 hours. Puts or gets eventually fail with an exception [1].
> > > >
> > > > Is there any safe way to release the region, like restarting the region
> > > > server?
> > > >
> > > > Cheers,
> > > > -Kristoffer
> > > >
> > > > [1]
> > > >
> > > > 2015-03-14 21:02:24,316 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > > Failed to unblock updates for region
> > > > tsdb,\x00\x00\x9ETU\xAC@\x00\x00\x01\x00\x00\xAD\x00\x00\x05\x00\x00\xA7,1426282871862.4512f92b3d81e9142542d3b458223b63.
> > > > 'IPC Server handler 9 on 60020' in 60000ms. The region is still busy.
> > > > 2015-03-14 21:02:24,316 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> > > > org.apache.hadoop.hbase.RegionTooBusyException: region is flushing
> > > >         at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2731)
> > > >         at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2002)
> > > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:2114)
> > > >         at sun.reflect.GeneratedMethodAccessor109.invoke(Unknown Source)
> > > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >         at java.lang.reflect.Method.invoke(Method.java:606)
> > > >         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > > >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
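For anyone landing on this thread later: the steps Ted suggests above (jstack first, manual flush attempt, then a restart as a last resort) can be sketched as shell commands. This is only a sketch; the pgrep pattern, script path, and output file are assumptions about a typical CDH-style deployment, not verified against this cluster.

```shell
# Capture a thread dump of the region server BEFORE restarting,
# so the stuck state is preserved for analysis.
# The pgrep pattern is an assumption; match it to how the process shows up in ps.
RS_PID=$(pgrep -f org.apache.hadoop.hbase.regionserver.HRegionServer)
jstack -l "$RS_PID" > /tmp/regionserver-jstack-$(date +%s).txt

# Attempt a manual flush from the HBase shell; 'flush' accepts a table name
# or a region name.
echo "flush 'tsdb'" | hbase shell

# Last resort: restart just this region server.
# The script location varies by packaging; this path is an assumption.
/usr/lib/hbase/bin/hbase-daemon.sh stop regionserver
/usr/lib/hbase/bin/hbase-daemon.sh start regionserver
```

Note that per the stack trace, the close handler is blocked inside an HDFS write, so a flush from the shell may well queue behind the same lock, which is presumably why Ted doubted it would work.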
