Hm, I had hoped this would have been fixed in HDFS 2. I have a script that I run several times per day that identifies under-replicated blocks and increases the replication factor of the affected files by 1. It then reduces the replication factor back to normal.
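
For illustration, here is a rough Python sketch of that kind of script. It is not the script I actually run. The function names are made up for this example, and the regular expression assumes fsck prints lines of the form "<path>: Under replicated <block>. Target Replicas is N but found M replica(s).", which you should check against the output of your own version before relying on it.

```python
import re
import subprocess
import sys

# Assumed shape of an fsck line for an under-replicated block, e.g.
# /path/file:  Under replicated <block>. Target Replicas is 3 but found 1 replica(s).
UNDER_REPLICATED = re.compile(
    r"^(?P<path>/.*?):\s+Under replicated .*?"
    r"Target Replicas is (?P<target>\d+) but found (?P<found>\d+)"
)


def parse_under_replicated(fsck_output):
    """Map each affected file path to its target replication factor."""
    targets = {}
    for line in fsck_output.splitlines():
        match = UNDER_REPLICATED.match(line.strip())
        if match:
            targets[match.group("path")] = int(match.group("target"))
    return targets


def setrep_commands(path, target):
    """Build the two commands: raise replication by one, then restore it."""
    return [
        ["hdfs", "dfs", "-setrep", str(target + 1), path],
        ["hdfs", "dfs", "-setrep", str(target), path],
    ]


def fix_under_replicated(root):
    """Run fsck under `root` and bump every file it reports (hypothetical helper)."""
    # fsck may exit non-zero on an unhealthy filesystem, so do not raise on that.
    fsck = subprocess.run(
        ["hdfs", "fsck", root], capture_output=True, text=True, check=False
    )
    for path, target in parse_under_replicated(fsck.stdout).items():
        for command in setrep_commands(path, target):
            subprocess.run(command, check=True)


# Only runs when a root path is given explicitly, e.g. `python bump_rep.py /`.
if __name__ == "__main__" and len(sys.argv) == 2:
    fix_under_replicated(sys.argv[1])
```

The two setrep calls are issued back to back here. "hdfs dfs -setrep" also accepts -w to block until the requested replication is reached, which may or may not be what you want if replication is stuck.
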
I can dig up a link if you need it.

On Jan 8, 2014 9:00 PM, "Cooper Bethea" <co...@siftscience.com> wrote:

> Following on--is there a way that I can forcibly replicate these blocks,
> perhaps by rsyncing the underlying files to other datanodes? As you might
> imagine under-replicated data makes me very uneasy.
>
>
> On Wed, Jan 8, 2014 at 12:00 PM, Cooper Bethea <co...@siftscience.com> wrote:
>
> > Hi HDFS developers,
> >
> > I have a worrying problem in a 2.0.0-cdh4.4.0 HDFS cluster I am running. 9
> > blocks in the cluster are persistently reported to be under-replicated per
> > "hdfs fsck".
> >
> > I am able to fetch the files that contain these blocks, so I know that the
> > data is there, but for some reason replication is not taking effect. In
> > hopes of getting the cluster to notice that there were under-replicated
> > blocks I tried using "hdfs dfs -setrep" to raise the replication factor,
> > but the cluster continues to report a single replica for each of these
> > blocks. When viewing master logs I see that the replication factor change
> > is respected, but there are no messages that refer to the under-replicated
> > blocks.
> >
> > Thanks for your time. Please let me know what I can do to investigate
> > further.