Thanks Jonathan,

I tried to reproduce this yesterday using a single dfs.name.dir, but
I'll give it a go again with multiple.

Will let you know what I turn up.

-Todd
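
For anyone who wants to script the reproduction quoted below, here is a minimal sketch, not a tested harness. It assumes a single-node install whose `bin/` directory holds the stock `start-dfs.sh`, `stop-dfs.sh`, `hadoop-daemon.sh` and `hadoop` scripts. The `/opt/hadoop` default, the `/repro` directory, the batch size, the `dfsadmin -safemode wait` calls and the use of empty files via `-touchz` are all assumptions, not details from the report. The function names are hypothetical.

```python
import subprocess
import time


def build_repro_plan(hadoop_home="/opt/hadoop", num_files=4500,
                     batch_size=100, interrupt_after_s=2.0):
    """Build the ordered steps as (kind, payload) tuples without running them."""
    bin_dir = hadoop_home.rstrip("/") + "/bin"
    hadoop = bin_dir + "/hadoop"
    start = ("run", [bin_dir + "/start-dfs.sh"])
    stop = ("run", [bin_dir + "/stop-dfs.sh"])
    # Assumption: block until the name node is up and out of safe mode.
    wait_ready = ("run", [hadoop, "dfsadmin", "-safemode", "wait"])

    # Assumption: empty files are enough to grow the edit log.
    paths = ["/repro/f%05d" % i for i in range(num_files)]
    touches = [
        ("run", [hadoop, "fs", "-touchz"] + paths[i:i + batch_size])
        for i in range(0, num_files, batch_size)
    ]

    plan = [
        start,                                            # step 1
        ("run", [bin_dir + "/hadoop-daemon.sh", "stop",   # step 1: no checkpoints
                 "secondarynamenode"]),
        wait_ready,
        ("run", [hadoop, "fs", "-mkdir", "/repro"]),
    ]
    plan += touches                                       # step 2
    plan += [
        stop,                                             # step 3
        start,                                            # step 4
        ("sleep", interrupt_after_s),                     # step 5: mid-replay
        stop,                                             # step 6
        start,                                            # step 7
        wait_ready,
        ("run", [hadoop, "fs", "-ls", "/repro"]),         # step 8
    ]
    return plan


def run_plan(plan):
    """Execute the plan in order, printing each command's exit code."""
    for kind, payload in plan:
        if kind == "sleep":
            time.sleep(payload)
        else:
            result = subprocess.run(payload)
            print(result.returncode, " ".join(payload[:4]))
```

Calling `run_plan(build_repro_plan(hadoop_home=...))` on a throwaway cluster would walk through the eight steps. If the final listing is empty or missing `/repro`, that matches the reported behaviour. The right `interrupt_after_s` depends on how long the edit log takes to replay on the machine in question.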

On Tue, Feb 9, 2010 at 12:05 PM, Allen, Jonathan <jonathan.all...@hp.com> wrote:
> Todd,
>
> Unfortunately my test system is air-gapped from the internet, so I
> haven't been able to transfer my test case across yet, but the basic steps
> are as follows:
>
> 1) start-dfs (also shut down the secondary to make sure that it doesn't
> checkpoint away the edit log)
> 2) create lots of small files so that there is a large edit log (I created 
> about 4,500 files resulting in an edit log of just over 1MB).
> 3) stop-dfs
> 4) start-dfs
> 5) wait for name node to start reading the edit log but not long enough for 
> it to finish reading it (I waited for a couple of seconds).
> 6) stop-dfs
> 7) start-dfs
> 8) listing the hdfs directory now shows it in the same state as at step (1) 
> rather than the correct state as at step (3).
>
> This was running with the Yahoo distro of 0.20.1.
>
> The dfs.name.dir is configured to use directories on 2 local drives and 1 NFS 
> mounted drive.
>
> Thanks,
> Jonathan
>
> Jonathan Allen
> UKGP, NS&R, Defence and Security
> HP Enterprise Services
> Telephone +44 1682 292101
> Email jonathan.allen...@hp.com
> Street address, Unit 29, Alexandra Way, Ashchurch Business Park, Tewkesbury, 
> Gloucestershire. GL20 8NB
>
> Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 
> 1HN
> Registered No: 690597 England
> The contents of this message and any attachments to it are confidential and 
> may be legally privileged. If you have received this message in error, you 
> should delete it from your system immediately and advise the sender.
> To any recipient of this message within HP, unless otherwise stated you 
> should consider this message and attachments as "HP CONFIDENTIAL".
>
>
> -----Original Message-----
> From: Todd Lipcon [mailto:t...@cloudera.com]
> Sent: 09 February 2010 01:11
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Name Node Corruption When Shutdown Too Soon
>
> Hi Jonathan,
>
> Another question: how have you configured dfs.name.dir? Do you have
> several directories configured?
>
> Thanks
> -Todd
>
> On Mon, Feb 8, 2010 at 4:45 PM, Todd Lipcon <t...@cloudera.com> wrote:
>> Hey Jonathan,
>>
>> As Konstantin mentioned, I've been looking into a couple issues that
>> could be related. At first glance it doesn't sound like you've run
>> into quite the same thing.
>>
>> What version did you see this on? The steps to reproduce are something like:
>>
>> 1) Start a NN
>> 2) Perform a bunch of edits so there is a large edit log
>> 3) kill -9 the NN
>> 4) start the NN again
>> 5) while it is in the middle of replaying edits, kill -9 it again
>> 6) start the NN, and lose all the previous edits?
>>
>> Or did I misunderstand what happened? If that sounds right, I'll give
>> it a go and see if I can reproduce.
>>
>> Thanks
>> -Todd
>>
>> On Sun, Feb 7, 2010 at 8:45 AM, Allen, Jonathan <jonathan.all...@hp.com> wrote:
>>> I've come across a name node bug and just wanted to check if it's a known 
>>> issue before I formally raise it (I've had a quick look through the 
>>> database but couldn't see anything obvious).
>>>
>>> If the name node is shut down before it has completed reading through the 
>>> edit log then the edit log gets removed without the image file being 
>>> updated. This results in the name node reverting to its previously saved state
>>> (out of sync with the data nodes) and the most recent edits getting lost.
>>>
>>> Does anybody recognise this as a known issue or should I raise it?
>>>
>>> Thanks,
>>> Jonathan Allen
>>
>
