(sorry for the delay in replying)

Hi Zheng

You are right about HDFS RAID. It is used to save space and is not involved
in the file write path. The generation of parity blocks and the reduction of
the replication factor happen only after a configurable amount of time.
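
To make that after-the-fact flow concrete, here is a rough sketch (this is
not the actual RaidNode code; simple XOR parity stands in for Reed-Solomon,
and the paths, class name, and toy block size are made up for illustration):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Rough sketch of offline parity generation: the source blocks already
// exist on disk, a background job XORs them into a single parity block,
// and only afterwards would the file's replication factor be lowered.
public class OfflineParitySketch {
  // Toy block size so the example runs anywhere; real HDFS blocks are 64 MB+.
  static final int BLOCK_SIZE = 1 << 20;

  public static void main(String[] args) throws IOException {
    Path[] sourceBlocks = {
        Paths.get("/tmp/blk_0"), Paths.get("/tmp/blk_1"), Paths.get("/tmp/blk_2") };
    byte[] parity = new byte[BLOCK_SIZE];

    for (Path blk : sourceBlocks) {
      byte[] data = Files.readAllBytes(blk);
      int n = Math.min(data.length, parity.length);
      for (int i = 0; i < n; i++) {
        parity[i] ^= data[i];          // fold each source block into the parity
      }
    }
    Files.write(Paths.get("/tmp/parity_0"), parity);
    // A real system would now ask the NameNode to lower the source file's
    // replication factor, e.g. from 3 down to 2.
  }
}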

What is the design you have in mind? When an HDFS file is being written,
the data is generated block by block, but generating parity blocks requires
multiple source blocks to be ready. The writer would therefore need to
buffer the original data, either in memory or on disk. If it is spilled to
disk because of memory pressure, will this end up being similar to writing
the file with replication 2?
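
Here is a rough sketch of what I mean by the buffering problem (again, not an
existing HDFS API; XOR parity over k blocks stands in for a real erasure
code, and the class name, "stripe width", and sink streams are invented):

import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Sketch of an inline approach: the writer cannot emit a parity block
// until it has seen a full stripe of k source blocks, so it must hold
// on to them somewhere in the meantime.
public class InlineParitySketch {
  private final int stripeWidth;           // k source blocks per parity block
  private final int blockSize;
  private final List<byte[]> pending = new ArrayList<>();
  private final OutputStream dataSink;     // where source blocks would go
  private final OutputStream paritySink;   // where parity blocks would go

  public InlineParitySketch(int stripeWidth, int blockSize,
                            OutputStream dataSink, OutputStream paritySink) {
    this.stripeWidth = stripeWidth;
    this.blockSize = blockSize;
    this.dataSink = dataSink;
    this.paritySink = paritySink;
  }

  // Accepts one finished source block (assumed to be at most blockSize bytes).
  public void writeBlock(byte[] block) throws IOException {
    pending.add(block);                    // must buffer until the stripe is full
    dataSink.write(block);
    if (pending.size() == stripeWidth) {
      flushStripe();
    }
  }

  private void flushStripe() throws IOException {
    byte[] parity = new byte[blockSize];
    for (byte[] block : pending) {
      for (int i = 0; i < block.length; i++) {
        parity[i] ^= block[i];             // XOR the stripe into one parity block
      }
    }
    paritySink.write(parity);
    pending.clear();                       // only now can the buffered blocks be dropped
    // A partial last stripe would need the same treatment on close().
  }
}

With 64 MB blocks and, say, a stripe width of 10, that is roughly 640 MB of
buffering per open file, which is why spilling to local disk starts to look
a lot like the replication-2 case.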

Ram


On Thu, Oct 13, 2011 at 1:16 AM, Zheng Da <zhengda1...@gmail.com> wrote:

> Hello all,
>
> Right now HDFS still uses simple replication to increase data
> reliability. It works, but it wastes disk space as well as network and
> disk bandwidth. For data-intensive applications (those that need to
> write large results to HDFS), it limits the throughput of MapReduce,
> and it is also very energy-inefficient.
>
> Is the community trying to use erasure codes to increase data
> reliability? I know someone is working on HDFS-RAID, but that only
> addresses disk space. In many cases, network and disk bandwidth matter
> more, since they are the factors limiting MapReduce throughput. Has
> anyone tried to use erasure coding to reduce the amount of data written
> to HDFS? I know that reducing replication might hurt read performance,
> but I think it is still important to reduce the volume of data written
> to HDFS.
>
> Thanks,
> Da
>
