On Apr 9, 2009, at 5:45 PM, Stas Oskin wrote:
Hi.
I have 2 questions about HDFS performance:
1) How fast are the read and write operations over the network, in Mbps?
Depends. What hardware? How much hardware? Is the cluster under
load? What does your I/O load look like? As a rule of thumb, you can
probably expect very close to hardware speed.
At this instant, our cluster is doing about 10k reads / sec, and each
read is 128KB, so about 1.2GB/s in aggregate. The max we've recorded on
this cluster is 8GB/s. http://rcf.unl.edu/ganglia/?c=red-workers However, you
probably don't care much about that unless you get to run on our
hardware ;)
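
For anyone checking the arithmetic behind the 1.2GB/s figure, here is a minimal sketch. It assumes binary units (1 GB = 1024 * 1024 KB); with decimal units the same numbers come out nearer 1.28GB/s. The function name is only illustrative.

```python
# Back-of-the-envelope check of the figures quoted above.
READS_PER_SEC = 10_000   # "about 10k reads / sec"
READ_SIZE_KB = 128       # "each read is 128KB"


def aggregate_throughput_gb_per_s(reads_per_sec, read_size_kb):
    """Aggregate throughput in GB/s, assuming 1 GB = 1024 * 1024 KB."""
    return reads_per_sec * read_size_kb / (1024 * 1024)


throughput = aggregate_throughput_gb_per_s(READS_PER_SEC, READ_SIZE_KB)
print(f"{throughput:.2f} GB/s")
```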
Unfortunately, this question is kind of like asking "How fast is
Linux?" There are just so many different performance characteristics
that giving a good answer is always tough. My recommendation is to
take an afternoon and play around with it.
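
As one possible starting point for that kind of experiment, here is a rough sketch that times sequential 128KB reads of a file and reports MB/s. It is only an assumption about how one might measure this: the helper name is made up, the demo reads a local temporary file (so it mostly measures the OS page cache), and you would have to point it at a path backed by the storage you actually care about.

```python
import os
import tempfile
import time

CHUNK_SIZE = 128 * 1024  # 128KB per read, matching the read size mentioned above


def measure_read_throughput(path, chunk_size=CHUNK_SIZE):
    """Read the file sequentially and return (bytes_read, MB_per_second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    mb_per_s = (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return total, mb_per_s


def demo(size_bytes=8 * 1024 * 1024):
    """Write a throwaway local file, time reading it back, then delete it."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size_bytes))
        path = tmp.name
    try:
        return measure_read_throughput(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    nbytes, rate = demo()
    print(f"read {nbytes} bytes at {rate:.1f} MB/s")
```

Run it a few times, with different file sizes and under different cluster load, before trusting any single number.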
2) If the chunk server is located on the same host as the client, is
there any optimization in read operations?
For example, Kosmos FS describes the following functionality:
"Localhost optimization: One copy of data
is placed on the chunkserver on the same
host as the client doing the write
Helps reduce network traffic"
Yes, this optimization is performed.
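
To make the quoted idea concrete, here is a toy sketch of "put one copy on the writer's own host". It is not the actual HDFS or Kosmos FS placement code; the function name, the host names, and the replication factor of 3 are all assumptions for illustration, and the remaining replicas are simply picked at random.

```python
import random


def choose_replica_hosts(client_host, storage_hosts, replication=3, rng=None):
    """Toy placement: prefer the client's own host, then fill with other hosts.

    Hypothetical helper for illustration only, not a real filesystem API.
    """
    rng = rng or random.Random()
    chosen = []
    # Localhost optimization: if the writer also runs a storage daemon,
    # the first copy stays on that machine and needs no network transfer.
    if client_host in storage_hosts:
        chosen.append(client_host)
    # Fill the remaining slots from the other hosts, in random order.
    remote = [h for h in storage_hosts if h != client_host]
    rng.shuffle(remote)
    chosen.extend(remote[: replication - len(chosen)])
    return chosen


hosts = ["node1", "node2", "node3", "node4"]
placement = choose_replica_hosts("node2", hosts, rng=random.Random(0))
print(placement[0])  # the writer's own host comes first
```

A client on the same host can then read that first copy locally instead of pulling it over the network.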
Brian