Hi Chris,

Since you're using CDH, I'm BCC'ing hdfs-dev@ and moving us to cdh-user@.

You should be able to manually copy the under-replicated block files and
their accompanying .meta checksum files to a different datanode, then
restart that datanode. I'm curious why you're hitting this issue, though;
I haven't encountered it before. Can you send me your NN logs, either as an
attachment or a file drop? Also, what version of CDH are you using?

Here are a few other things you can check:

* There are a number of block replication stats available in the NN /jmx
webui, e.g. PendingReplicationBlocks, UnderReplicatedBlocks, and
ScheduledReplicationBlocks. These will tell you whether the NN is at least
attempting to replicate your blocks (pending and scheduled).
* Look in the NN log for BlockPlacementPolicy errors. It will help to enable
DEBUG-level logging for that class.
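
If it helps, here is a rough sketch of pulling those three counters out of
the /jmx output with a script instead of reading the page by eye. The bean
name (FSNamesystem), the "qry" URL parameter, and the example host and port
are my assumptions about a typical setup, so adjust them to whatever your NN
actually serves.

```python
import json
from urllib.request import urlopen

# Assumption: the counters live on the FSNamesystem bean of the NN.
FSNAMESYSTEM_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"

# The three stats named in the first bullet above.
STATS = (
    "PendingReplicationBlocks",
    "UnderReplicatedBlocks",
    "ScheduledReplicationBlocks",
)


def extract_replication_stats(jmx_json):
    """Return the replication counters found in a /jmx JSON document (str)."""
    found = {}
    for bean in json.loads(jmx_json).get("beans", []):
        for key in STATS:
            if key in bean:
                found[key] = bean[key]
    return found


def fetch_replication_stats(namenode_http):
    """Fetch /jmx from the NN web UI and return the replication counters.

    namenode_http is a placeholder base URL such as
    "http://nn.example.com:50070" (hypothetical host, assumed port).
    """
    url = namenode_http + "/jmx?qry=" + FSNAMESYSTEM_BEAN
    with urlopen(url, timeout=10) as resp:
        return extract_replication_stats(resp.read().decode("utf-8"))
```

Calling fetch_replication_stats() every minute or so and comparing the
results should show whether the pending and scheduled counts ever move off
zero while UnderReplicatedBlocks stays at 9.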

Best,
Andrew


On Thu, Jan 9, 2014 at 10:46 AM, Cooper Bethea <co...@siftscience.com> wrote:

> I have only 9 under-replicated blocks on the cluster, and it is very
> important that I restore my cluster to a fully-replicated state. Is there a
> way I can manually copy these blocks to other datanodes, or perhaps new
> datanodes?
>
>
> On Thu, Jan 9, 2014 at 10:34 AM, Cooper Bethea <co...@siftscience.com> wrote:
>
> > Chris, Steve, thanks for responding.
> >
> > Overnight I ran a script to bump replication, then lower it, as Chris
> > suggested. There has been no effect--all underreplicated blocks still
> > have only 1 replica.
> >
> > Steve, I am running the rebalancer.
> >
> >
> > On Thu, Jan 9, 2014 at 1:33 AM, Steve Loughran <ste...@hortonworks.com> wrote:
> >
> >> Are you running the rebalancer?
> >>
> >>
> >> On 9 January 2014 04:40, Chris Embree <cemb...@gmail.com> wrote:
> >>
> >> > It's too bad that this hasn't been corrected in HDFS 2.0. I have a
> >> > script that I run several times a day to ensure that blocks are
> >> > replicated correctly. Here is a link to an article about it:
> >> > http://dataforprofit.com/?p=427
> >> >
> >> >
> >> > On Wed, Jan 8, 2014 at 9:00 PM, Cooper Bethea <co...@siftscience.com> wrote:
> >> >
> >> > > Following on--is there a way that I can forcibly replicate these
> >> > > blocks, perhaps by rsyncing the underlying files to other datanodes?
> >> > > As you might imagine under-replicated data makes me very uneasy.
> >> > >
> >> > >
> >> > > On Wed, Jan 8, 2014 at 12:00 PM, Cooper Bethea <co...@siftscience.com> wrote:
> >> > >
> >> > > > Hi HDFS developers,
> >> > > >
> >> > > > I have a worrying problem in a 2.0.0-cdh4.4.0 HDFS cluster I am
> >> > > > running. 9 blocks in the cluster are persistently reported to be
> >> > > > under-replicated per "hdfs fsck".
> >> > > >
> >> > > > I am able to fetch the files that contain these blocks, so I know
> >> > > > that the data is there, but for some reason replication is not
> >> > > > taking effect. In hopes of getting the cluster to notice that there
> >> > > > were under-replicated blocks I tried using "hdfs dfs -setrep" to
> >> > > > raise the replication factor, but the cluster continues to report a
> >> > > > single replica for each of these blocks. When viewing master logs I
> >> > > > see that the replication factor change is respected, but there are
> >> > > > no messages that refer to the under-replicated blocks.
> >> > > >
> >> > > > Thanks for your time. Please let me know what I can do to
> >> > > > investigate further.
> >> > > >
> >> > >
> >> >
> >>
> >>
> >
> >
>
