I see. In order to beat 3x replication from an IO perspective, we
need to generate the parity blocks carefully. We can't simply buffer
the source blocks on the local disk and then generate the parity
blocks from them, because that costs two extra disk IOs per block
(one write and one read).
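As a strawman, here is a minimal sketch of what "generating parity carefully" could look like: fold each source block into an in-memory parity buffer while the block is being streamed out, so the buffer-and-reread IO disappears. The plain XOR code, the class name StripingParityWriter, and the fixed stripe width are illustrative assumptions on my part, not a committed design.

// Sketch: accumulate an XOR parity block in memory while source blocks
// stream through the client, so no extra disk IO is needed.
// Assumptions: fixed block size, simple XOR parity over a stripe of k blocks.
import java.io.IOException;
import java.io.OutputStream;

class StripingParityWriter {
    private final byte[] parity;          // in-memory parity for the current stripe
    private final OutputStream parityOut; // where the finished parity block goes

    StripingParityWriter(int blockSize, OutputStream parityOut) {
        this.parity = new byte[blockSize];
        this.parityOut = parityOut;
    }

    /** Fold one source block into the parity as it is being written out. */
    void accumulate(byte[] sourceBlock) {
        for (int i = 0; i < sourceBlock.length; i++) {
            parity[i] ^= sourceBlock[i];
        }
    }

    /** After k source blocks, emit the parity block and reset for the next stripe. */
    void finishStripe() throws IOException {
        parityOut.write(parity);
        java.util.Arrays.fill(parity, (byte) 0);
    }
}

A real scheme would of course need a stronger code than plain XOR if it has to tolerate more than one lost block per stripe.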
One problem with parity blocks is that they can't work as
Our largest cluster is several thousand nodes and we still run with a
replication factor of 3. We have not seen any benefit from a larger
replication factor except when it is a resource that lots of machines will use,
aka the distributed cache. Other than that, 3 seems just fine for most
MapReduce jobs.
Hello Ram,
Sorry, I didn't notice your reply.
I don't really have a complete design in mind. I am wondering whether the
community is interested in using an alternative scheme to support data
reliability, and whether there are any plans to do so.
You are right, we might need to buffer the source blocks on the local disk.
(sorry for the delay in replying)
Hi Zheng
You are right about HDFS RAID. It is used to save space and is not involved
in the file write path. The generation of parity blocks and the reduction of
the replication factor happen after a configurable amount of time.
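For anyone not familiar with that flow, a rough sketch of the after-the-fact raiding step is below, assuming simple XOR parity over a stripe of source blocks. RaidSketch, raidFile, and the stripe geometry are made-up names for illustration, and the real RaidNode logic is more involved, but FileSystem.setReplication() is the actual HDFS call that drops the extra replicas once the parity exists.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class RaidSketch {

    /** Read src stripe by stripe, write a parity file, then lower src's replication. */
    public static void raidFile(Configuration conf, Path src, Path parityPath,
                                int blockSize, int stripeWidth,
                                short newReplication) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        byte[] block = new byte[blockSize];
        byte[] parity = new byte[blockSize];

        try (FSDataInputStream in = fs.open(src);
             FSDataOutputStream out = fs.create(parityPath)) {
            int blocksInStripe = 0;
            int read;
            while ((read = readBlock(in, block)) > 0) {
                for (int i = 0; i < read; i++) {
                    parity[i] ^= block[i];        // fold this block into the stripe parity
                }
                if (++blocksInStripe == stripeWidth) {
                    out.write(parity);            // one parity block per full stripe
                    java.util.Arrays.fill(parity, (byte) 0);
                    blocksInStripe = 0;
                }
            }
            if (blocksInStripe > 0) {
                out.write(parity);                // parity for the trailing partial stripe
            }
        }
        // Only after the parity file is durable is it safe to drop extra replicas.
        fs.setReplication(src, newReplication);
    }

    /** Fill buf with up to buf.length bytes; returns the count read, 0 at EOF. */
    private static int readBlock(FSDataInputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }
}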
What is the design you have in mind? When t
Hello all,
Right now HDFS is still using simple replication to increase data
reliability. Even though it works, it wastes disk space, network
bandwidth, and disk bandwidth. For data-intensive applications (which
need to write large results to HDFS), it limits the throughput of
MapReduce. Als
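To put rough numbers on the overhead, here is a back-of-the-envelope comparison. The (10, 4) stripe geometry is purely an illustrative assumption, and the "extra network" figures assume the first copy of each block can be written locally.

// Back-of-the-envelope: bytes stored and shipped per byte of user data.
// The (10, 4) stripe below is only an illustrative assumption.
public class OverheadSketch {
    public static void main(String[] args) {
        double replication = 3.0;              // three full copies of every block
        int dataBlocks = 10, parityBlocks = 4; // hypothetical (k, m) stripe

        double replicatedDisk = replication;                                   // 3.0x
        double parityDisk = (dataBlocks + parityBlocks) / (double) dataBlocks; // 1.4x

        // "Extra network" = bytes shipped beyond the locally written first copy.
        System.out.printf("3x replication: %.1fx disk, %.1fx extra network%n",
                replicatedDisk, replication - 1);
        System.out.printf("(%d,%d) parity:  %.1fx disk, %.1fx extra network%n",
                dataBlocks, parityBlocks, parityDisk, parityDisk - 1);
    }
}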