Re: replication in HDFS

2011-11-01 Thread Zheng Da
e processing. > > --Bobby Evans > > On 10/31/11 2:50 PM, "Zheng Da" wrote: > > Hello Ram, > > Sorry, I didn't notice your reply. > > I don't really have a complete design in my mind. I wonder if the > community is interested in using an altern

Re: replication in HDFS

2011-10-31 Thread Zheng Da
tiple source blocks to be ready, so the writer will need to > buffer the original data, either in memory or on disk. If it is saved on > disk because of memory pressure, will this be similar to writing the file > with replication 2? > > Ram > > > On Thu, Oct 13, 2011 at 1:16 AM,

replication in HDFS

2011-10-12 Thread Zheng Da
Hello all, Right now HDFS still uses simple replication to increase data reliability. It works, but it wastes disk space as well as network and disk bandwidth. For data-intensive applications (those that need to write large results to HDFS), it limits the throughput of MapReduce. Als