Hi, Stack. Thank you very much for your help: we have resolved the issue!
There was a stale *znode* in the ZooKeeper tree stating that the table was in 'ENABLING' state, but it had last been modified several days earlier:

    [zk: myserver:2181(CONNECTED) 0] ls /hbase/table
    [page]
    [zk: myserver:2181(CONNECTED) 1] ls /hbase/table/page
    []
    [zk: myserver:2181(CONNECTED) 2] get /hbase/table/page
    �11174@myserver*ENABLING*
    cZxid = 0x31a4a
    ctime = Fri Aug 10 11:23:18 EDT 2012
    mZxid = 0x31c4f
    mtime = Fri Aug 10 11:24:57 EDT 2012
    pZxid = 0x31a4a
    cversion = 0
    dataVersion = 5
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 40
    numChildren = 0

I did the following to fix the issue:
1) stopped HBase: master and all region servers;
2) stopped ZooKeeper;
3) made a backup of the ZooKeeper data (/var/lib/zookeeper);
4) started ZooKeeper;
5) removed the znode using the ZooKeeper CLI:

    [zk: hbase01dev.303net.pvt:2181(CONNECTED) 3] delete /hbase/table/page
    [zk: hbase01dev.303net.pvt:2181(CONNECTED) 4] ls /hbase/table/page
    Node does not exist: /hbase/table/page

6) started HBase: master and all region servers.

After this everything was fine: the table showed as 'enabled', and 'disable' worked as well:

    hbase(main):001:0> is_enabled 'page'
    true
    0 row(s) in 0.7120 seconds
    hbase(main):002:0> disable 'page'
    0 row(s) in 3.1160 seconds

While it was in the hung state, the table was actually being served by the region servers: I could count rows, do scans, run MR jobs using the HBaseStorage Pig class, etc. What was blocked was updates to the table schema: 'alter' did not work because the table was not in the disabled state, but 'disable' did not work because the table was not in the enabled state. All regions of the table were hosted by region servers.
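For reference, the six steps above can be sketched as a shell sequence. This is a sketch, not the exact commands I ran: the start/stop script locations, the backup path, and the piping of commands into the CLIs are assumptions that depend on the installation; only the znode path /hbase/table/page and the data directory /var/lib/zookeeper come from the procedure above.

```shell
# Sketch of the recovery procedure; script paths are illustrative and
# depend on how HBase and ZooKeeper are installed on your hosts.

# 1-2) Stop HBase (master and all region servers), then ZooKeeper.
$HBASE_HOME/bin/stop-hbase.sh
$ZOOKEEPER_HOME/bin/zkServer.sh stop

# 3) Back up the ZooKeeper data directory before touching anything.
tar czf /tmp/zookeeper-backup-$(date +%F).tar.gz /var/lib/zookeeper

# 4) Start ZooKeeper again.
$ZOOKEEPER_HOME/bin/zkServer.sh start

# 5) Delete the stale table znode; "hbase zkcli" is the ZooKeeper CLI
#    shipped with HBase (piping a command in this way is an assumption).
echo "delete /hbase/table/page" | $HBASE_HOME/bin/hbase zkcli

# 6) Restart HBase and verify the table state from the HBase shell.
$HBASE_HOME/bin/start-hbase.sh
echo "is_enabled 'page'" | $HBASE_HOME/bin/hbase shell
```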
Here is an excerpt from the underlying HDFS structure:

    -rw-r--r--   3 hbase   hbase        1307 2012-08-10 11:24 /hbase/page/.tableinfo.0000000004
    drwxr-xr-x   - hbase   hbase           0 2012-08-10 11:24 /hbase/page/.tmp
    drwxr-xr-x   - hbase   hbase           0 2012-08-16 18:55 /hbase/page/01084884c5d8b61a5a1e529822563cae
    -rw-r--r--   3 hbase   hbase         523 2012-08-13 16:39 /hbase/page/01084884c5d8b61a5a1e529822563cae/.regioninfo
    drwxr-xr-x   - hbase   hbase           0 2012-08-16 19:57 /hbase/page/01084884c5d8b61a5a1e529822563cae/.tmp
    drwxr-xr-x   - hbase   hbase           0 2012-08-17 03:28 /hbase/page/01084884c5d8b61a5a1e529822563cae/s
    -rw-rw-rw-   3 jenkins supergroup 742993 2012-08-17 03:08 /hbase/page/01084884c5d8b61a5a1e529822563cae/s/11adf78853944d02a3e39c1eb0b631a3
    -rw-rw-rw-   3 jenkins supergroup 916762 2012-08-17 00:22 /hbase/page/01084884c5d8b61a5a1e529822563cae/s/a0a9c21d470549f9ab6c29d73d26ce8d
    -rw-r--r--   3 hbase   hbase     4713301 2012-08-16 18:55 /hbase/page/01084884c5d8b61a5a1e529822563cae/s/cf447b6576ad4cfe898dfee8e77c0e2c
    drwxr-xr-x   - hbase   hbase           0 2012-08-17 03:28 /hbase/page/01084884c5d8b61a5a1e529822563cae/t
    -rw-rw-rw-   3 jenkins supergroup 27844042 2012-08-17 00:22 /hbase/page/01084884c5d8b61a5a1e529822563cae/t/48a5c5cb10204854a7b76017145dfda7
    -rw-r--r--   3 hbase   hbase     697429695 2012-08-16 19:57 /hbase/page/01084884c5d8b61a5a1e529822563cae/t/58b9027f020548e880f0d8c3c636ce18
    -rw-rw-rw-   3 jenkins supergroup 15529996 2012-08-17 03:08 /hbase/page/01084884c5d8b61a5a1e529822563cae/t/bdc10a1f6285412caa60d23d745c1180
    ...

The question I have now is whether I had to stop the whole HBase cluster or not. Is it safe to remove a stale *znode* while HBase is operating, if I am sure no compaction / splitting is going on?

--
Sincerely yours
Pavel Vozdvizhenskiy
Grid Dynamics / BigData

On Fri, Aug 17, 2012 at 3:47 AM, Stack <[email protected]> wrote:
> On Thu, Aug 16, 2012 at 3:48 PM, Pavel Vozdvizhenskiy
> <[email protected]> wrote:
> > I would appreciate on any help how to fix it.
> >
>
> I've not come across this one before.
> > If you list whats under /hbase/table? Does the table show there? You > could try removing the znode? You can look by doing ./bin/hbase zkcli > > St.Ack >
