Yes, I think the shutdown handler should not verify a machine that is included 
in the deadservers set.

2011-07-28 22:25:09,336 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Waiting for split writer 
threads to finish
2011-07-28 22:25:09,450 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Split writers finished
2011-07-28 22:25:09,602 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path 
hdfs://158.1.101.82:9000/hbase/.META./1028785192/recovered.edits/0000000000000025786
 (wrote 121 edits in 2567ms)
2011-07-28 22:25:09,860 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path 
hdfs://158.1.101.82:9000/hbase/ufdr5/745550eb514e441f31ff26dbde8402ae/recovered.edits/0000000000000617740
 (wrote 211642 edits in 141887ms)

// Log splitting finished and root table assignment was handled first. At the 
// same time, the region server came out of GC, so verifying root passed.
2011-07-28 22:25:10,085 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: hlog file splitting 
completed in 316329 ms for 
hdfs://158.1.101.82:9000/hbase/.logs/158-1-101-82,20020,1311885942386

// The region server's report is rejected, and the region server will shut itself down.
2011-07-28 22:28:30,577 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
Server REPORT rejected; currently processing 158-1-101-82,20020,1311885942386 
as dead server
2011-07-28 22:28:37,591 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:20000-0x23171e103d10018 Creating (or updating) unassigned node for 
1028785192 with OFFLINE state
2011-07-28 22:28:37,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
No previous transition plan was found (or we are ignoring an existing plan) for 
.META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, 
dest=158-1-101-202,20020,1311878322145; 2 (online=2, exclude=null) available 
servers
2011-07-28 22:28:37,704 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Assigning region .META.,,1.1028785192 to 158-1-101-202,20020,1311878322145
2011-07-28 22:28:37,733 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=M_ZK_REGION_OFFLINE, server=158-1-101-82:20000, 
region=1028785192/.META.

All of the logs are in the attachment.
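
To make the suggestion above concrete, here is a minimal sketch of the check I have in mind. It is not the actual HBase ShutdownHandler or ServerManager code: the class name, the method names, and the plain Set used for the dead servers are all hypothetical, and they only illustrate the idea that a root location on a server already being processed as dead should not count as verified, even if that server answers.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; these names do not exist in HBase.
class RootVerificationSketch {
    // Stand-in for the master's set of servers currently processed as dead.
    private final Set<String> deadServers = new HashSet<>();

    void markDead(String serverName) {
        deadServers.add(serverName);
    }

    // A root location is usable only if its server is not in the dead set
    // AND the server responded to the verification call.
    boolean rootLocationIsUsable(String serverName, boolean respondsToVerify) {
        if (deadServers.contains(serverName)) {
            // The server may have come back from a long GC pause and still
            // answer, but the master is about to reject it, so do not trust it.
            return false;
        }
        return respondsToVerify;
    }

    public static void main(String[] args) {
        RootVerificationSketch sketch = new RootVerificationSketch();
        sketch.markDead("158-1-101-82,20020,1311885942386");
        System.out.println(
            sketch.rootLocationIsUsable("158-1-101-82,20020,1311885942386", true));
        System.out.println(
            sketch.rootLocationIsUsable("158-1-101-202,20020,1311878322145", true));
    }
}
```

With a check like this, the root table would be reassigned in the case shown in the logs, instead of staying on a server that the master rejects a few minutes later.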

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stack
Sent: August 16, 2011 12:33
To: [email protected]
Subject: Re: Root table couldn't be opened

On Mon, Aug 15, 2011 at 9:23 PM, Gaojinchao <[email protected]> wrote:
> Why did the master replay its logs if it did not exit?

Sorry.  Which logs?

> Zk is expired because of gc. But region server isn't shutdown.
>

Right, but it probably went down soon after it came out of GC, right?


> (I like how you noticed the log message that says 82 has root and meta)
>
> Added=158-1-101-82,20020,1311885942386 to dead servers, submitted shutdown 
> handler to be executed, root=true, meta=true
> It said that 82 has root and meta. "root=true" shows the dead region server 
> has root table.
>

So, you think there is a bug in our shutdown handler where we are not
doing -ROOT- processing properly?
St.Ack
