Thank you for reply.

It seems master had some problem. But I am not sure what 's up.
I am not familiar with RPC and need keep digging. 


// the master connetion was ok
2011-05-04 04:47:06,267 INFO org.apache.hadoop.hbase.master.CatalogJanitor: 
Scanned 378 catalog row(s) and gc'd 9 unreferenced parent region(s)
2011-05-04 04:52:05,833 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
Skipping load balancing.  servers=3 regions=371 average=123.666664 
mostloaded=124 leastloaded=124

//the connection had some problem.
2011-05-04 04:53:05,858 WARN org.apache.hadoop.hbase.master.CatalogJanitor: 
Failed scan of catalog table
java.net.SocketTimeoutException: Call to /10.85.129.177:60020 failed on socket 
timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while 
waiting for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=/10.85.129.177:42736 
remote=/10.85.129.177:60020]
        at 
org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:802)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
        at 
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy6.getRegionInfo(Unknown Source)

// It was also fault.
2011-05-04 09:52:38,425 WARN org.apache.hadoop.hbase.master.CatalogJanitor: 
Failed scan of catalog table
java.net.SocketTimeoutException: Call to /10.85.129.177:60020 failed on socket 
timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while 
waiting for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=/10.85.129.177:50964 
remote=linux-88.site/10.85.129.177:60020]
        at 
org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:802)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
        at 
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy6.getRegionInfo(Unknown Source)

// region server had done nothing between 04:47 and 04:52. 
2011-05-04 04:43:08,888 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: 
Roll 
/hbase/.logs/linux-88.site,60020,1304408517081/linux-88.site%3A60020.1304451798487,
 entries=1473, filesize=124253101. New hlog 
/hbase/.logs/linux-88.site,60020,1304408517081/linux-88.site%3A60020.1304455388698
2011-05-04 04:47:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
accesses=526799, hits=37164, hitRatio=7.05%%, cachingAccesses=37279, 
cachingHits=37164, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
evictedPerRun=NaN
2011-05-04 04:52:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
evictedPerRun=NaN
2011-05-04 04:57:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
evictedPerRun=NaN
2011-05-04 05:02:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
evictedPerRun=NaN

-----邮件原件-----
发件人: [email protected] [mailto:[email protected]] 代表 Jean-Daniel Cryans
发送时间: 2011年5月10日 6:17
收件人: [email protected]
主题: Re: Hmaster has some warn logs.

Anything closer to 05:13:05 in the region server? That's an hour and a
half before.

J-D

On Thu, May 5, 2011 at 6:03 PM, Gaojinchao <[email protected]> wrote:
> Thanks for your reply.
>
> I got it. But master and Region server is same machine and the cluster is 
> free.
>
> In my scenario:
> 1/ before 2011-05-04 03:42. bulk data was putting and some IPC was blocked.
>
> 2011-05-04 03:42:39,155 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Blocking updates for 'IPC Server handler 1 on 60020' on region 
> test1,2010010100#8613880051110#0000,1304451639350.dfbdc1cf590bfffb50ace4c7d116c524.:
>  memstore size 128.1m is >= than blocking 128.0m size
> 2011-05-04 03:42:39,194 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
> Blocking updates for 'IPC Server handler 6 on 60020' on region 
> test1,2010010100#8613880051110#0000,1304451639350.dfbdc1cf590bfffb50ace4c7d116c524.:
>  memstore size 128.4m is >= than blocking 128.0m size
>
>
> 2/ 2011-05-04 05:13 hbase client had exited and cluster was free.
>
> 2011-05-04 04:47:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526799, hits=37164, hitRatio=7.05%%, cachingAccesses=37279, 
> cachingHits=37164, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
> 2011-05-04 04:52:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
> cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
> 2011-05-04 04:57:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
> cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
> 2011-05-04 05:02:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
> cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
> 2011-05-04 05:07:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
> cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
> 2011-05-04 05:12:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
> cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
> 2011-05-04 05:17:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
> cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
> 2011-05-04 05:22:03,372 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> LRU Stats: total=2.56 MB, free=196.61 MB, max=199.18 MB, blocks=115, 
> accesses=526913, hits=37278, hitRatio=7.07%%, cachingAccesses=37393, 
> cachingHits=37278, cachingHitsRatio=99.69%%, evictions=0, evicted=0, 
> evictedPerRun=NaN
>
> 3/ master had some warning log, but region server did not any abnormal.
>
> -----邮件原件-----
> 发件人: [email protected] [mailto:[email protected]] 代表 Jean-Daniel Cryans
> 发送时间: 2011年5月6日 1:27
> 收件人: [email protected]
> 主题: Re: Hmaster has some warn logs.
>
> The master was trying to talk to the region server at 10.85.129.177
> but it took more than 60 seconds to get an answer.
>
> J-D
>
> On Thu, May 5, 2011 at 5:40 AM, Gaojinchao <[email protected]> wrote:
>> In our test cluster. There is a lot of logs for socket timeout.
>> I try to dig it, but nothing.
>> Who have a clue? Thanks.
>>
>> Hbase version : 0.90.2
>>
>> Logs:
>> 2011-05-04 05:13:05,856 WARN org.apache.hadoop.hbase.master.CatalogJanitor: 
>> Failed scan of catalog table
>> java.net.SocketTimeoutException: Call to /10.85.129.177:60020 failed on 
>> socket timeout exception: java.net.SocketTimeoutException: 60000 millis 
>> timeout while waiting for channel to be ready for read. ch : 
>> java.nio.channels.SocketChannel[connected local=/10.85.129.177:54980 
>> remote=/10.85.129.177:60020]
>>        at 
>> org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:802)
>>        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
>>        at 
>> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>        at $Proxy6.getRegionInfo(Unknown Source)
>>        at 
>> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRegionLocation(CatalogTracker.java:424)
>>        at 
>> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:272)
>>        at 
>> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:331)
>>        at 
>> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
>>        at 
>> org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:255)
>>        at 
>> org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:237)
>>        at 
>> org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:116)
>>        at 
>> org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:85)
>>        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
>> Caused by: java.net.SocketTimeoutException: 60000 millis timeout while 
>> waiting for channel to be ready for read. ch : 
>> java.nio.channels.SocketChannel[connected local=/10.85.129.177:54980 
>> remote=/10.85.129.177:60020]
>>        at 
>> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:165)
>>        at 
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>>        at 
>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>>        at java.io.FilterInputStream.read(FilterInputStream.java:116)
>>        at 
>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:299)
>>        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>>        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>>        at java.io.DataInputStream.readInt(DataInputStream.java:370)
>>        at 
>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:539)
>>        at 
>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:477)
>>
>

Reply via email to