Hi Jinsong,

I don't see any data loss here.

The sequence of events from the logs:

===> NN allocates block
2010-05-18 21:21:29,731 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.allocateBlock: /hbase/.META./1028785192/info/656097411976846533.
blk_5636039758999247483_31304886

===> First DN reports it has received block
2010-05-18 21:21:29,913 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 10.110.8.63:50010 is added to
blk_5636039758999247483_31304886 size 441

===> Client calls completeFile
2010-05-18 21:21:29,913 INFO org.apache.hadoop.hdfs.StateChange: DIR*
NameSystem.completeFile: file
/hbase/.META./1028785192/info/656097411976846533 is closed by
DFSClient_-919320526

===> The 2nd and 3rd DNs have not yet heartbeated since receiving the block,
so the block's live replica count at the NN is below the target, and
unnecessary re-replication is scheduled. This is a known issue - I was
actually meaning to file a JIRA about it this week.
2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to
datanode(s) 10.110.8.86:50010 10.110.8.69:50010
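
For what it's worth, here's a rough, self-contained sketch of that race
(illustrative Java with made-up names, not the actual FSNamesystem code): at
the moment the file is completed, only the replicas the NN has already heard
about count toward the replication target, so pipeline nodes that haven't
reported yet look like missing replicas and extra copies get scheduled.

import java.util.HashSet;
import java.util.Set;

public class ReplicationRaceSketch {
    static final int TARGET_REPLICATION = 3;

    public static void main(String[] args) {
        // All three pipeline DNs already have the block on disk...
        Set<String> replicasOnDisk =
            Set.of("10.110.8.63", "10.110.8.69", "10.110.8.59");
        // ...but only the first one has reported it to the NN so far.
        Set<String> reportedToNN = new HashSet<>();
        reportedToNN.add("10.110.8.63");

        // The file has just been closed, so the block is now checked for
        // under-replication. The NN can only count what it has heard about.
        if (reportedToNN.size() < TARGET_REPLICATION) {
            int missing = TARGET_REPLICATION - reportedToNN.size();
            System.out.println("ask " + reportedToNN.iterator().next()
                + " to replicate to " + missing + " more datanode(s)");
        }

        // A few seconds later the remaining DNs check in, and the block
        // ends up with more than TARGET_REPLICATION replicas.
        Set<String> lateReporters = new HashSet<>(replicasOnDisk);
        lateReporters.removeAll(reportedToNN);
        System.out.println("late reports from: " + lateReporters);
    }
}

The next log lines show exactly that: the remaining DNs report in a few
seconds later, which is what turns the scheduled copies into excess replicas.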

===> Other DNs check in (within 4-5 seconds of file completion, which is a
reasonable heartbeat time)
2010-05-18 21:21:33,156 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 10.110.8.69:50010 is added to
blk_5636039758999247483_31304886 size 441
2010-05-18 21:21:34,413 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 10.110.8.59:50010 is added to
blk_5636039758999247483_31304886 size 441

===> 8 seconds later the first replication goes through and cleanup of
excess replicas happens
2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 10.110.8.86:50010 is added to
blk_5636039758999247483_31304886 size 441
2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.chooseExcessReplicates: (10.110.8.63:50010,
blk_5636039758999247483_31304886) is added to recentInvalidateSets
2010-05-18 21:21:43,995 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
10.110.8.63:50010 to delete  blk_5636039758999247483_31304886
blk_4349310048904429157_31304519
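
In case it helps to see the shape of it, here's a minimal sketch (again
illustrative Java, not the real NameNode code) of the two-step cleanup those
two log lines correspond to, as I understand it: chooseExcessReplicates only
queues the (datanode, block) pair in recentInvalidateSets, and the actual
delete command is handed to that datanode on a later heartbeat reply, which
is why the "ask ... to delete" line shows up a couple of seconds afterwards.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class ExcessReplicaCleanupSketch {
    // datanode -> blocks queued for deletion ("recentInvalidateSets")
    static final Map<String, Queue<String>> invalidateSets = new HashMap<>();

    // Step 1: an excess replica is chosen and queued; nothing is deleted yet.
    static void chooseExcessReplicate(String datanode, String block) {
        invalidateSets.computeIfAbsent(datanode, d -> new ArrayDeque<>()).add(block);
        System.out.println("chooseExcessReplicates: (" + datanode + ", "
            + block + ") is added to recentInvalidateSets");
    }

    // Step 2: when that datanode next heartbeats, the queued deletes are
    // piggy-backed on the reply.
    static List<String> heartbeat(String datanode) {
        Queue<String> pending =
            invalidateSets.getOrDefault(datanode, new ArrayDeque<>());
        List<String> toDelete = new ArrayList<>(pending);
        pending.clear();
        if (!toDelete.isEmpty()) {
            System.out.println("ask " + datanode + " to delete " + toDelete);
        }
        return toDelete;
    }

    public static void main(String[] args) {
        chooseExcessReplicate("10.110.8.63:50010",
            "blk_5636039758999247483_31304886");
        heartbeat("10.110.8.63:50010"); // a couple of seconds later, per the log
    }
}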

===> another 14 seconds later, the other replication goes through and
another excess is invalidated
2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addStoredBlock: blockMap updated: 10.110.8.85:50010 is added to
blk_5636039758999247483_31304886 size 441
2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.chooseExcessReplicates: (10.110.8.69:50010,
blk_5636039758999247483_31304886) is added to recentInvalidateSets

===> about 5 minutes later, the regionserver performs a compaction and asks
the NN to delete this file
2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
of 10.110.8.63:50010
2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
of 10.110.8.69:50010
2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
of 10.110.8.59:50010
2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
of 10.110.8.86:50010
2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
of 10.110.8.85:50010
2010-05-18 21:26:39,389 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=hadoop,hadoop
      ip=/10.110.8.85 cmd=delete
 src=/hbase/.META./1028785192/info/656097411976846533    dst=null
 perm=null

As for the errors seen in the regionserver, the issue is that it called
open() before replication was done, and therefore got only one block
location. When that replica was removed, the RS should have gone back to
the NN for more replicas. I'm not sure if the codebase you're running
includes HDFS-445; judging from the CHANGES.txt, it appears not. So it's
likely that you're hitting the bug where DFSClient wouldn't refetch block
locations from the NN for positional reads.
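
To illustrate the client side, here's a rough sketch of the retry shape that
a fix along the lines of HDFS-445 gives you for positional reads: once every
cached location for the block has failed, go back to the NN for fresh
locations instead of giving up. The interfaces and method names
(fetchLocationsFromNameNode, readFromDatanode) are hypothetical stand-ins,
not the actual DFSClient API.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PositionalReadRetrySketch {
    interface Locator { List<String> fetchLocationsFromNameNode(String block); }
    interface Reader  { byte[] readFromDatanode(String dn, String block) throws IOException; }

    static byte[] pread(String block, Locator locator, Reader reader,
                        int maxRefetches) throws IOException {
        List<String> deadNodes = new ArrayList<>();
        for (int attempt = 0; attempt <= maxRefetches; attempt++) {
            // Refetch locations from the NN each round; relying on a stale,
            // single-entry location cache is the failure mode described above.
            for (String dn : locator.fetchLocationsFromNameNode(block)) {
                if (deadNodes.contains(dn)) {
                    continue;
                }
                try {
                    return reader.readFromDatanode(dn, block);
                } catch (IOException e) {
                    deadNodes.add(dn); // mark this replica dead, try the next
                }
            }
        }
        throw new IOException("Could not obtain block " + block);
    }
}

Without that outer refetch loop, a pread that cached a single location at
open() time has nowhere left to go once that replica is invalidated, which
matches the "Failed to connect ... for block" errors in your RS log.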

Thanks
-Todd


On Fri, May 21, 2010 at 3:32 PM, Jinsong Hu <jinsong...@hotmail.com> wrote:

> Hi, Todd:
>  I have cut all the logs for those 10 minutes from the 3 machines, and if
> you grep 5636039758999247483, you will see all the logs from the
> regionserver, namenode, and datanode, so you can compare the time
> sequence. All our machines' time is synced via ntp, so you can compare
> the events exactly side by side.
>  The build I took is from the yahoo 0.20.9 distribution.
>
> Jinsong
>
> --------------------------------------------------
> From: "Todd Lipcon" <t...@cloudera.com>
> Sent: Friday, May 21, 2010 2:20 PM
> To: <hdfs-dev@hadoop.apache.org>
> Subject: Re: Fw: hadoop data loss issue discovered. please fix!
>
>
>  Hi Jinsong,
>>
>> Could you upload a tarball of the log files somewhere from each of the DNs
>> and the RS involved? It's hard to trace through the logs in the email (the
>> email added all kinds of wrapping, etc.).
>>
>> -Todd
>>
>> On Fri, May 21, 2010 at 2:17 PM, Jinsong Hu <jinsong...@hotmail.com>
>> wrote:
>>
>>  Hi, There:
>>> While using the hadoop 0.20.9-yahoo distribution and hbase 0.20.4, I
>>> found that hadoop loses blocks under certain situations, and thus
>>> corrupts hbase tables.
>>>
>>>
>>>
>>> I compared the namenode, datanode, and hbase regionserver logs and
>>> figured out the reason.
>>>
>>>
>>>
>>> The regionserver 10.110.8.85 asks the namenode 10.110.8.83 to save a
>>> block, 10.110.8.84 gives multiple IPs, and the regionserver chooses
>>> 10.110.8.63 and saves the block there. After a while, the namenode asks
>>> for the block to be replicated to the 10.110.8.86 and 10.110.8.69
>>> machines. A moment later, .86 and .69 received the replication, but
>>> strangely, 10.110.8.59 and 10.110.8.85 also received replicas of the
>>> block, even though they are not in the replication list.
>>>
>>>
>>>
>>> Then chooseExcessReplicates asks to delete the excess from .63 and .69,
>>> thinking there are too many replicas. Even though .63 held the original
>>> copy, the algorithm chose which replica to delete based on the amount of
>>> free disk space. A moment later, addToInvalidates (not from
>>> chooseExcessReplicates) asks for the block to be deleted on .86, .85,
>>> and .59. I checked the code; this can only be possible if the block is
>>> corrupted.
>>>
>>>
>>>
>>> In the end, this block doesn't exist anywhere in the cluster. And it is
>>> permanently lost.
>>>
>>>
>>>
>>>
>>>
>>> namenode:
>>>
>>>
>>>
>>>
>>>
>>> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.chooseExcessReplicates: (10.110.8.63:50010,
>>> blk_5636039758999247483_31304886) is added to recentInvalidateSets
>>>
>>> 2010-05-18 21:21:43,995 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.63:50010 to delete  blk_5636039758999247483_31304886
>>> blk_4349310048904429157_31304519
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> The block was initially added to 10.110.8.63, then replicated to
>>> 10.110.8.63, .59, .69, .86, and .85. Subsequently, addToInvalidates in
>>> the replication process removed all of them. My reading of the code
>>> shows that the replica is treated as corrupt, and all copies get deleted.
>>>
>>>
>>>
>>>
>>>
>>> 2010-05-18 21:21:29,913 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addStoredBlock: blockMap updated: 10.110.8.63:50010 is added
>>> to blk_5636039758999247483_31304886 size 441
>>>
>>> 2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to
>>> datanode(s) 10.110.8.86:50010 10.110.8.69:50010
>>>
>>> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.chooseExcessReplicates: (10.110.8.63:50010,
>>> blk_5636039758999247483_31304886) is added to recentInvalidateSets
>>>
>>> 2010-05-18 21:21:43,995 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.63:50010 to delete  blk_5636039758999247483_31304886
>>> blk_4349310048904429157_31304519
>>>
>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addToInvalidates: blk_5636039758999247483 is added to
>>> invalidSet of 10.110.8.63:50010
>>>
>>> 2010-05-18 21:26:45,953 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.63:50010 to delete  blk_-1838286221287242082_31305179
>>> blk_8446762564182513417_31305143 blk_5636039758999247483_31304886
>>> blk_4628640249731313760_31305046 blk_7460947863067370701_31270225
>>> blk_-4468681536500281247_31270225 blk_8453517711101429609_31303917
>>> blk_9126133835045521966_31303972 blk_4623110280826973929_31305203
>>> blk_-2581238696314957800_31305033 blk_7461125351290749755_31305052
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to
>>> datanode(s) 10.110.8.86:50010 10.110.8.69:50010
>>>
>>> 2010-05-18 21:21:33,156 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addStoredBlock: blockMap updated: 10.110.8.69:50010 is added
>>> to blk_5636039758999247483_31304886 size 441
>>>
>>> 2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.chooseExcessReplicates: (10.110.8.69:50010,
>>> blk_5636039758999247483_31304886) is added to recentInvalidateSets
>>>
>>> 2010-05-18 21:21:59,005 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.69:50010 to delete  blk_5636039758999247483_31304886
>>>
>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addToInvalidates: blk_5636039758999247483 is added to
>>> invalidSet of 10.110.8.69:50010
>>>
>>> 2010-05-18 21:26:42,951 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.69:50010 to delete  blk_-2124965527858346013_31270213
>>> blk_-5027506345849158498_31270213 blk_5636039758999247483_31304886
>>> blk_9148821113904458973_31305189 blk_4850797749721229572_31305072
>>> blk_252039065084461924_31305031 blk_-8351836728009062091_31305208
>>> blk_-7576696059515014894_31305194 blk_-2900250119736465962_31270214
>>> blk_471700613578524871_31304950 blk_-190744003190006044_31305064
>>> blk_7265057386742001625_31305073
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to
>>> datanode(s) 10.110.8.86:50010 10.110.8.69:50010
>>>
>>> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addStoredBlock: blockMap updated: 10.110.8.86:50010 is added
>>> to blk_5636039758999247483_31304886 size 441
>>>
>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addToInvalidates: blk_5636039758999247483 is added to
>>> invalidSet of 10.110.8.86:50010
>>>
>>> 2010-05-18 21:26:42,951 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.86:50010 to delete  blk_-6242136662924452584_31259201
>>> blk_5636039758999247483_31304886 blk_4850797749721229572_31305072
>>> blk_252039065084461924_31305031 blk_-1317144678443645904_31305204
>>> blk_6050185755706975664_31270230 blk_2671416971885801868_31304948
>>> blk_-5582352089328547938_31305022 blk_-3115115738671914626_31270210
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2010-05-18 21:21:34,413 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addStoredBlock: blockMap updated: 10.110.8.59:50010 is added
>>> to blk_5636039758999247483_31304886 size 441
>>>
>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addToInvalidates: blk_5636039758999247483 is added to
>>> invalidSet of 10.110.8.59:50010
>>>
>>> 2010-05-18 21:26:39,950 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.59:50010 to delete  blk_5636039758999247483_31304886
>>> blk_-4528512156635399625_31305212 blk_1439789418382469336_31305158
>>> blk_8860574934531794641_31270219 blk_-8358193301564392132_31305029
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addStoredBlock: blockMap updated: 10.110.8.85:50010 is added
>>> to blk_5636039758999247483_31304886 size 441
>>>
>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> NameSystem.addToInvalidates: blk_5636039758999247483 is added to
>>> invalidSet of 10.110.8.85:50010
>>>
>>> 2010-05-18 21:26:39,950 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>>> ask 10.110.8.85:50010 to delete  blk_-6242136662924452584_31259201
>>> blk_5636039758999247483_31304886 blk_4628640249731313760_31305046
>>> blk_4747588241975451642_31305123 blk_-6876078628884993825_31270230
>>> blk_-4468681536500281247_31270225 blk_7325830193509411302_31270230
>>> blk_8453517711101429609_31303917 blk_-6094734447689285387_31305127
>>> blk_3353439739797003235_31305037 blk_-5027506345849158498_31270213
>>> blk_1484161645992497144_31270225 blk_4464987648045469454_31305144
>>> blk_7460947863067370701_31270225 blk_-1170815606945644545_31270230
>>> blk_6050185755706975664_31270230 blk_-8358193301564392132_31305029
>>> blk_2671416971885801868_31304948 blk_5593547375459437465_31286511
>>> blk_-2581238696314957800_31305033 blk_4732635559915402193_31270230
>>> blk_-2124965527858346013_31270213 blk_-5837992573431863412_31286612
>>> blk_-432558447034944954_31270208 blk_-3407615138527189735_31305069
>>> blk_8860574934531794641_31270219 blk_233110856487529716_31270229
>>> blk_312750273180273303_31270228 blk_7461125351290749755_31305052
>>> blk_-8902661185532055148_31304947 blk_-8555258258738129670_31270210
>>> blk_252039065084461924_31305031 blk_9037118763503479133_31305120
>>> blk_-8494656323754369174_31305105 blk_9126133835045521966_31303972
>>> blk_-5582352089328547938_31305022 blk_-2900250119736465962_31270214
>>> blk_-3115115738671914626_31270210 blk_7612090442234634555_31270225
>>> blk_5876492007747505188_31270213 blk_471700613578524871_31304950
>>> blk_-190744003190006044_31305064
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> datanode 10.110.8.63:
>>>
>>>
>>>
>>> 2010-05-18 21:21:46,058 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block
>>> blk_5636039758999247483_31304886 file
>>> /hadoop_data_dir/dfs/data/current/subdir23/blk_5636039758999247483
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> hbase region server 10.110.8.85:
>>>
>>>
>>>
>>> DFSClient.java:
>>>
>>> DatanodeInfo chosenNode = bestNode(nodes, deadNodes);
>>> InetSocketAddress targetAddr =
>>>     NetUtils.createSocketAddr(chosenNode.getName());
>>> return new DNAddrPair(chosenNode, targetAddr);
>>>
>>>
>>>
>>> still picked 10.110.8.63, even though the namenode sent the command to
>>> delete the block at 21:21:43,995 and it was executed at 21:21:46,058?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2010-05-18 21:21:46,188 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
>>> connect to /10.110.8.63:50010 for file
>>> /hbase/.META./1028785192/info/656097411976846533 for block
>>> 5636039758999247483:java.io.IOException: Got error in response to
>>> OP_READ_BLOCK for file /hbase/.META./1028785192/info/656097411976846533
>>> for block 5636039758999247483
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
