Hi Jonathan, I've reproduced your issue.
I'll comment on HDFS-955 as I believe this is another manifestation of the same issue.

-Todd

On Tue, Feb 9, 2010 at 12:11 PM, Todd Lipcon <t...@cloudera.com> wrote:
> Thanks Jonathan,
>
> I tried to reproduce this yesterday using a single dfs.name.dir, but
> I'll give it a go again with multiple.
>
> Will let you know what I turn up.
>
> -Todd
>
> On Tue, Feb 9, 2010 at 12:05 PM, Allen, Jonathan <jonathan.all...@hp.com>
> wrote:
>> Todd,
>>
>> Unfortunately my test system is air gapped away from the internet so I
>> haven't been able to transfer my test case across yet, but the basic steps
>> are as follows:
>>
>> 1) start-dfs (also shut down the secondary to make sure that it didn't
>> checkpoint away the edit log)
>> 2) create lots of small files so that there is a large edit log (I created
>> about 4,500 files resulting in an edit log of just over 1MB).
>> 3) stop-dfs
>> 4) start-dfs
>> 5) wait for name node to start reading the edit log but not long enough for
>> it to finish reading it (I waited for a couple of seconds).
>> 6) stop-dfs
>> 7) start-dfs
>> 8) listing the hdfs directory now shows it in the same state as at step (1)
>> rather than the correct state as at step (3).
>>
>> This was running with the Yahoo distro of 0.20.1.
>>
>> The dfs.name.dir is configured to use directories on 2 local drives and 1
>> NFS mounted drive.
>>
>> Thanks,
>> Jonathan
>>
>> Jonathan Allen
>> UKGP, NS&R, Defence and Security
>> HP Enterprise Services
>> Telephone +44 1682 292101
>> Email jonathan.allen...@hp.com
>> Street address, Unit 29, Alexandra Way, Ashchurch Business Park, Tewkesbury,
>> Gloucestershire. GL20 8NB
>>
>> Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12
>> 1HN
>> Registered No: 690597 England
>> The contents of this message and any attachments to it are confidential and
>> may be legally privileged. If you have received this message in error, you
>> should delete it from your system immediately and advise the sender.
>> To any recipient of this message within HP, unless otherwise stated you
>> should consider this message and attachments as "HP CONFIDENTIAL".
>>
>>
>> -----Original Message-----
>> From: Todd Lipcon [mailto:t...@cloudera.com]
>> Sent: 09 February 2010 01:11
>> To: hdfs-dev@hadoop.apache.org
>> Subject: Re: Name Node Corruption When Shutdown Too Soon
>>
>> Hi Jonathan,
>>
>> Another question: how have you configured dfs.name.dir? Do you have
>> several directories configured?
>>
>> Thanks
>> -Todd
>>
>> On Mon, Feb 8, 2010 at 4:45 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>> Hey Jonathan,
>>>
>>> As Konstantin mentioned, I've been looking into a couple issues that
>>> could be related. At first glance it doesn't sound like you've run
>>> into quite the same thing.
>>>
>>> What version did you see this on? The steps to reproduce are something like:
>>>
>>> 1) Start a NN
>>> 2) Perform a bunch of edits so there is a large edit log
>>> 3) kill -9 the NN
>>> 4) start the NN again
>>> 5) while it is in the middle of replaying edits, kill -9 it again
>>> 6) start the NN, and lose all the previous edits?
>>>
>>> Or did I misunderstand what happened? If that sounds right, I'll give
>>> it a go and see if I can reproduce.
>>>
>>> Thanks
>>> -Todd
>>>
>>> On Sun, Feb 7, 2010 at 8:45 AM, Allen, Jonathan <jonathan.all...@hp.com>
>>> wrote:
>>>> I've come across a name node bug and just wanted to check if it's a known
>>>> issue before I formally raise it (I've had a quick look through the
>>>> database but couldn't see anything obvious).
>>>>
>>>> If the name node is shut down before it has completed reading through the
>>>> edit log then the edit log gets removed without the image file being
>>>> updated. This results in the name node reverting to its previously saved
>>>> state (out of sync with the data nodes) and the most recent edits getting
>>>> lost.
>>>>
>>>> Does anybody recognise this as a known issue or should I raise it?
>>>>
>>>> Thanks,
>>>> Jonathan Allen
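
For anyone who wants to script the eight steps Jonathan lists above, here is a rough sketch in Python. It is not his test case (that never left the air-gapped system). It assumes `start-dfs.sh`, `stop-dfs.sh`, `hadoop-daemon.sh` and `hadoop` are on the PATH of a single-node 0.20-era install. The `/repro` directory, the file naming and the `dfsadmin -safemode wait` calls are additions for illustration only, and the two-second pause in step 5 is timing-dependent and may need tuning for a given edit log size.

```python
import subprocess
import time


def build_repro_plan(num_files=4500, base_dir="/repro", replay_wait_s=2.0):
    """Return the reproduction steps as a list of ("run", argv) or ("sleep", seconds)."""
    plan = [
        # Step 1: start HDFS, then stop the secondary so it cannot checkpoint.
        ("run", ["start-dfs.sh"]),
        ("run", ["hadoop-daemon.sh", "stop", "secondarynamenode"]),
        # Not in the original steps: wait until the name node accepts writes.
        ("run", ["hadoop", "dfsadmin", "-safemode", "wait"]),
        ("run", ["hadoop", "fs", "-mkdir", base_dir]),
    ]
    # Step 2: many small (empty) files to grow the edit log.
    # One JVM launch per file is slow but keeps the sketch simple.
    for i in range(num_files):
        path = "%s/file-%05d" % (base_dir, i)
        plan.append(("run", ["hadoop", "fs", "-touchz", path]))
    plan += [
        ("run", ["stop-dfs.sh"]),    # step 3
        ("run", ["start-dfs.sh"]),   # step 4
        ("sleep", replay_wait_s),    # step 5: interrupt the edit log replay
        ("run", ["stop-dfs.sh"]),    # step 6
        ("run", ["start-dfs.sh"]),   # step 7
        ("run", ["hadoop", "dfsadmin", "-safemode", "wait"]),
        # Step 8: if the edits were lost, base_dir will be missing or empty.
        ("run", ["hadoop", "fs", "-ls", base_dir]),
    ]
    return plan


def execute(plan, dry_run=True):
    """Print the plan (dry run) or run each step, reporting non-zero exit codes."""
    for kind, arg in plan:
        if kind == "sleep":
            if dry_run:
                print("sleep %s" % arg)
            else:
                time.sleep(arg)
        elif dry_run:
            print(" ".join(arg))
        else:
            code = subprocess.call(arg)
            if code != 0:
                print("exit %d from: %s" % (code, " ".join(arg)))


if __name__ == "__main__":
    # Dry run with a handful of files; pass dry_run=False on a test cluster only.
    execute(build_repro_plan(num_files=3), dry_run=True)
```

Keeping the plan as plain data makes it easy to review the exact commands before running anything against a name node whose metadata you care about.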