I don't remember ever seeing this :| Was your secondary namenode running on a different host or storing its data in a different folder? Was that wiped out too?
J-D

On Wed, Apr 27, 2011 at 8:28 AM, Jonathan Bender <[email protected]> wrote:
> So it's definitely a case of HDFS not being able to recover the image.
> Maybe this is better directed toward another list, but has anyone had
> issues with this, or any suggestions for trying to eradicate this?
>
> 2011-04-26 17:15:56,898 INFO org.apache.hadoop.hdfs.server.common.Storage: Recovering storage directory /var/lib/hadoop-0.20/cache/hadoop/dfs/name from failed checkpoint.
> 2011-04-26 17:15:56,905 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 204
> 2011-04-26 17:15:57,020 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
> 2011-04-26 17:15:57,021 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 26833 loaded in 0 seconds.
> 2011-04-26 17:15:57,257 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Invalid opcode, reached end of edit log Number of transactions found 528
> 2011-04-26 17:15:57,258 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/edits of size 1049092 edits # 528 loaded in 0 seconds.
> 2011-04-26 17:15:57,265 ERROR org.apache.hadoop.hdfs.server.common.Storage: Unable to save image for /var/lib/hadoop-0.20/cache/hadoop/dfs/name
> java.io.IOException: saveLeases found path /hbase/base_tmp/.logs/sv004.my.domain.com,60020,1302882411768/sv004.my.domain.com%3A60020.1302882412951 but no matching entry in namespace.
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:5153)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1071)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.saveCurrent(FSImage.java:1170)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1118)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:347)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:321)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:267)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:461)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1202)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1211)
> 2011-04-26 17:15:57,273 WARN org.apache.hadoop.hdfs.server.common.Storage: FSImage:processIOError: removing storage: /var/lib/hadoop-0.20/cache/hadoop/dfs/name
> 2011-04-26 17:15:57,274 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 1553 msecs
>
> On Tue, Apr 26, 2011 at 5:19 PM, Jonathan Bender <[email protected]> wrote:
>> Wow, this is more intense than I thought...as soon as I load HBase again,
>> my HDFS filesystem essentially reverts to an older snapshot. As in, I
>> don't see any of the changes I had made since that time, in the hbase
>> table or otherwise.
>>
>> I'm using CDH3 beta 4, which I believe stores its local hbase data in a
>> different directory--not entirely sure where though.
>>
>> I'm not entirely sure what happened to mess this up, but it seems pretty
>> serious.
>>
>> On Tue, Apr 26, 2011 at 5:07 PM, Himanshu Vashishtha <[email protected]> wrote:
>>
>>> Could it be the /tmp/hbase-<userID> directory that is playing the
>>> culprit? Just a wild guess though.
>>>
>>> On Tue, Apr 26, 2011 at 5:56 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>
>>>> Unless HBase was running when you wiped that out (and even then), I
>>>> don't see how this could happen. Could you match those blocks to the
>>>> files using fsck and figure out when the files were created and
>>>> whether they were part of the old install?
>>>>
>>>> Thx,
>>>>
>>>> J-D
>>>>
>>>> On Tue, Apr 26, 2011 at 4:53 PM, Jonathan Bender <[email protected]> wrote:
>>>> > Hi all, I'm having a strange error which I can't exactly figure out.
>>>> >
>>>> > After wiping my /hbase HDFS directory to do a fresh install, I am
>>>> > getting "MISSING BLOCKS" in this /hbase directory, which causes HDFS
>>>> > to start up in safe mode. This doesn't happen until I start my
>>>> > region servers, so I have a feeling there is some kind of corrupted
>>>> > metadata being loaded from these region servers.
>>>> >
>>>> > Is there a graceful way to wipe the HBase directory clean? Any local
>>>> > directories on the region servers / master / ZK server that I should
>>>> > be wiping as well?
>>>> >
>>>> > Cheers,
>>>> > Jon
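For reference, the block-to-file check J-D suggests above can be done with HDFS fsck. A sketch against a Hadoop 0.20.x/CDH3 cluster (paths are from this thread; run it as the HDFS superuser):

```shell
# Map blocks back to files: list every file under /hbase with its blocks
# and the datanodes holding them, so missing blocks can be tied to a file.
hadoop fsck /hbase -files -blocks -locations

# Check overall namespace health and any files stuck open for write
# (region server logs like the .logs/ file in the error are open files).
hadoop fsck / -openforwrite

# Creation/modification times help tell old-install files from new ones.
hadoop fs -lsr /hbase

# Confirm whether the namenode is still stuck in safe mode.
hadoop dfsadmin -safemode get
```

Once a missing block is matched to a file, the timestamp from `hadoop fs -lsr` shows whether it predates the wipe.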
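On the "graceful wipe" question, a rough sketch of the usual sequence (assumes default locations: `hbase.rootdir` = /hbase, `zookeeper.znode.parent` = /hbase, `hbase.tmp.dir` = /tmp/hbase-${user}; the ZooKeeper host below is a placeholder):

```shell
# 1) Stop HBase on ALL nodes first (master and every region server),
#    otherwise live servers will rewrite state as you delete it.

# 2) Delete HBase's root directory in HDFS (destructive!).
hadoop fs -rmr /hbase

# 3) Remove HBase's znode in ZooKeeper so no stale region/server state
#    survives the reinstall; 'rmr' is the recursive delete in the ZK 3.x CLI.
zkCli.sh -server zk1.example.com:2181 rmr /hbase

# 4) On each node, clear the local scratch dir if hbase.tmp.dir is default.
rm -rf /tmp/hbase-$USER
```

Step 3 addresses the symptom in this thread: if the znode is left behind, region servers can come back up referring to WALs and regions that no longer exist in HDFS.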
