Hi Jonathan,

I've reproduced your issue.

I'll comment on HDFS-955 as I believe this is another manifestation of
the same issue.

-Todd

On Tue, Feb 9, 2010 at 12:11 PM, Todd Lipcon <t...@cloudera.com> wrote:
> Thanks Jonathan,
>
> I tried to reproduce this yesterday using a single dfs.name.dir, but
> I'll give it a go again with multiple.
>
> Will let you know what I turn up.
>
> -Todd
>
> On Tue, Feb 9, 2010 at 12:05 PM, Allen, Jonathan <jonathan.all...@hp.com> wrote:
>> Todd,
>>
>> Unfortunately my test system is air-gapped from the internet, so I 
>> haven't been able to transfer my test case across yet, but the basic steps 
>> are as follows:
>>
>> 1) start-dfs (also shut down the secondary to make sure that it doesn't 
>> checkpoint away the edit log)
>> 2) create lots of small files so that there is a large edit log (I created 
>> about 4,500 files resulting in an edit log of just over 1MB).
>> 3) stop-dfs
>> 4) start-dfs
>> 5) wait for name node to start reading the edit log but not long enough for 
>> it to finish reading it (I waited for a couple of seconds).
>> 6) stop-dfs
>> 7) start-dfs
>> 8) listing the hdfs directory now shows it in the same state as at step (1) 
>> rather than the correct state as at step (3).
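
The eight steps above can be sketched as a shell script. This is only an illustration under assumptions: it presumes the stock Hadoop 0.20 scripts (start-dfs.sh, stop-dfs.sh, hadoop-daemon.sh) are on the PATH of a single-host test cluster, and the names DRY_RUN, NUM_FILES, REPLAY_WAIT and TEST_DIR are hypothetical knobs introduced here, not part of the original test case. With the default DRY_RUN=1 it only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps; not the original (untransferred) test case.
# DRY_RUN=1 prints each command instead of running it. Set DRY_RUN=0 only
# against a disposable test cluster.
DRY_RUN="${DRY_RUN:-1}"
NUM_FILES="${NUM_FILES:-4500}"          # about 4,500 files gave an edit log of just over 1MB
REPLAY_WAIT="${REPLAY_WAIT:-2}"         # "a couple of seconds"; may need tuning per machine
TEST_DIR="${TEST_DIR:-/editlog-repro}"  # hypothetical HDFS directory for the small files

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    run start-dfs.sh                               # step 1
    run hadoop-daemon.sh stop secondarynamenode    # step 1: keep the secondary from checkpointing
    run hadoop dfsadmin -safemode wait             # assumption: wait until writes are allowed
    run hadoop fs -mkdir "$TEST_DIR"
    i=1
    while [ "$i" -le "$NUM_FILES" ]; do            # step 2: many small files, large edit log
        run hadoop fs -touchz "$TEST_DIR/file-$i"
        i=$((i + 1))
    done
    run stop-dfs.sh                                # step 3
    run start-dfs.sh                               # step 4
    run sleep "$REPLAY_WAIT"                       # step 5: interrupt the edit log replay
    run stop-dfs.sh                                # step 6
    run start-dfs.sh                               # step 7
    run hadoop fs -ls "$TEST_DIR"                  # step 8: are the files still listed?
}

repro
```

If the problem reproduces, the final listing would show the directory as it was at step (1), without the files created in step (2).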
>>
>> This was running with the Yahoo distro of 0.20.1.
>>
>> The dfs.name.dir is configured to use directories on 2 local drives and 1 
>> NFS mounted drive.
>>
>> Thanks,
>> Jonathan
>>
>> Jonathan Allen
>> UKGP, NS&R, Defence and Security
>> HP Enterprise Services
>> Telephone +44 1682 292101
>> Email jonathan.allen...@hp.com
>> Street address, Unit 29, Alexandra Way, Ashchurch Business Park, Tewkesbury, 
>> Gloucestershire. GL20 8NB
>>
>> Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 
>> 1HN
>> Registered No: 690597 England
>> The contents of this message and any attachments to it are confidential and 
>> may be legally privileged. If you have received this message in error, you 
>> should delete it from your system immediately and advise the sender.
>> To any recipient of this message within HP, unless otherwise stated you 
>> should consider this message and attachments as "HP CONFIDENTIAL".
>>
>>
>> -----Original Message-----
>> From: Todd Lipcon [mailto:t...@cloudera.com]
>> Sent: 09 February 2010 01:11
>> To: hdfs-dev@hadoop.apache.org
>> Subject: Re: Name Node Corruption When Shutdown Too Soon
>>
>> Hi Jonathan,
>>
>> Another question: how have you configured dfs.name.dir? Do you have
>> several directories configured?
>>
>> Thanks
>> -Todd
>>
>> On Mon, Feb 8, 2010 at 4:45 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>> Hey Jonathan,
>>>
>>> As Konstantin mentioned, I've been looking into a couple issues that
>>> could be related. At first glance it doesn't sound like you've run
>>> into quite the same thing.
>>>
>>> What version did you see this on? The steps to reproduce are something like:
>>>
>>> 1) Start a NN
>>> 2) Perform a bunch of edits so there is a large edit log
>>> 3) kill -9 the NN
>>> 4) start the NN again
>>> 5) while it is in the middle of replaying edits, kill -9 it again
>>> 6) start the NN, and lose all the previous edits?
>>>
>>> Or did I misunderstand what happened? If that sounds right, I'll give
>>> it a go and see if I can reproduce.
>>>
>>> Thanks
>>> -Todd
>>>
>>> On Sun, Feb 7, 2010 at 8:45 AM, Allen, Jonathan <jonathan.all...@hp.com> wrote:
>>>> I've come across a name node bug and just wanted to check if it's a known 
>>>> issue before I formally raise it (I've had a quick look through the 
>>>> database but couldn't see anything obvious).
>>>>
>>>> If the name node is shut down before it has finished reading through the 
>>>> edit log, the edit log gets removed without the image file being 
>>>> updated.  This results in the name node reverting to its previously saved 
>>>> state (out of sync with the data nodes) and the most recent edits being 
>>>> lost.
>>>>
>>>> Does anybody recognise this as a known issue or should I raise it?
>>>>
>>>> Thanks,
>>>> Jonathan Allen
