Have you tried TestDFSIO? I think it's quite a good benchmark to measure the
performance of HDFS. If you want to know how to write data to HDFS directly, you
can read its code.
Da
On 1/28/11 6:36 PM, Pei HE wrote:
> Hi all,
> I want to know the detailed performance of Hadoop.
>
> I am writing a
On 01/05/2011 12:44 AM, Christopher Smith wrote:
Yes, my C program can reach 100MB/s or even 110MB/s when writing data to
the disk sequentially, but with direct I/O enabled, the maximum throughput
is about 140MB/s. But the biggest difference is CPU usage.
Without direct I/O, the operating system uses
> On Jan 5, 2011, at 3:42 PM, Da Zheng wrote:
>
>> I'm not sure of that. I wrote a small checksum program for testing. After
>> the size of a block gets to larger than 8192 bytes, I don't see much
>> performance improvement. See the code below. I don
I'm not sure of that. I wrote a small checksum program for testing.
Once the block size grows beyond 8192 bytes, I don't see much
performance improvement. See the code below. I don't think 64MB can
bring us any benefit.
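The checksum program referred to above is not preserved in this excerpt. As a rough illustration of the kind of measurement described, here is a minimal sketch in Python. It is not the original program: `checksum_throughput` is a hypothetical helper name, and `zlib.crc32` is used as a stand-in checksum. It checksums the same buffer in blocks of different sizes and reports throughput, so you can see where larger blocks stop helping on your own machine.

```python
import time
import zlib


def checksum_throughput(data, block_size):
    """Checksum `data` block by block; return (MB/s, list of per-block CRCs).

    Hypothetical helper for illustration; zlib.crc32 is a stand-in checksum.
    """
    view = memoryview(data)  # slicing a memoryview avoids copying each block
    start = time.perf_counter()
    crcs = [
        zlib.crc32(view[off:off + block_size])
        for off in range(0, len(view), block_size)
    ]
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return len(data) / elapsed / 1e6, crcs


if __name__ == "__main__":
    data = bytes(64 * 1024 * 1024)  # 64 MiB of zeros as test input
    for block_size in (512, 4096, 8192, 131072, 1 << 20, 64 << 20):
        rate, _ = checksum_throughput(data, block_size)
        print(f"block size {block_size:>9} bytes: {rate:8.1f} MB/s")
```

The numbers depend heavily on the CPU and the checksum implementation, so treat the output as a local measurement rather than a general result.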
I did change io.bytes.per.checksum to 131072 in Hadoop, and the
won't see
> much, if any performance improvement.
>
>
>
> -Original Message-
> From: Da Zheng [mailto:zhen...@cs.jhu.edu]
> Sent: Tuesday, January 04, 2011 11:11 PM
> To: common-dev@hadoop.apache.org
> Subject: Re: Hadoop use direct I/O in Linux?
>
>
On 1/5/11 12:44 AM, Christopher Smith wrote:
> On Tue, Jan 4, 2011 at 9:11 PM, Da Zheng wrote:
>
>> On 1/4/11 5:17 PM, Christopher Smith wrote:
>>> If you use direct I/O to reduce CPU time, that means you are saving CPU
>> via
>>> DMA. If you are using J
ore processor with hyperthreading enabled).
Cpu(s): 3.4%us, 32.8%sy, 0.0%ni, 50.0%id, 12.1%wa, 0.0%hi, 1.6%si, 0.0%st
But with direct I/O, the system time can be as little as 3%.
Best,
Da
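For readers who want to reproduce this kind of comparison, here is a minimal sketch in Python (not the C program mentioned in this thread). It writes a file sequentially, with or without `O_DIRECT`, and reports wall-clock time and system CPU time from `os.times()`. The names `write_sequential` and `_write_blocks` are hypothetical. The sketch assumes Linux, uses an anonymous `mmap` buffer because direct I/O needs an aligned buffer, and falls back to buffered writes if the filesystem rejects `O_DIRECT` with `EINVAL`.

```python
import errno
import mmap
import os
import tempfile
import time


def _write_blocks(path, flags, buf, total_bytes):
    """Write `buf` repeatedly until `total_bytes` are written; time the loop."""
    fd = os.open(path, flags, 0o644)
    try:
        cpu_before = os.times()
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            n = os.write(fd, buf)
            if n != len(buf):
                raise OSError("short write")
            written += n
        os.fsync(fd)  # make sure buffered data actually reaches the disk
        seconds = time.perf_counter() - start
        system_cpu = os.times().system - cpu_before.system
        return written, seconds, system_cpu
    finally:
        os.close(fd)


def write_sequential(path, total_bytes, block_size=1 << 20, direct=True):
    """Hypothetical helper: sequential write, optionally with O_DIRECT."""
    if total_bytes % block_size or block_size % mmap.PAGESIZE:
        raise ValueError("sizes must be multiples of block and page size")
    base = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    o_direct = getattr(os, "O_DIRECT", 0) if direct else 0
    buf = mmap.mmap(-1, block_size)  # anonymous mapping is page-aligned
    try:
        used_direct = bool(o_direct)
        try:
            result = _write_blocks(path, base | o_direct, buf, total_bytes)
        except OSError as exc:
            if not o_direct or exc.errno != errno.EINVAL:
                raise
            # Assumption: EINVAL means O_DIRECT is unsupported here.
            used_direct = False
            result = _write_blocks(path, base, buf, total_bytes)
        written, seconds, system_cpu = result
        return {"bytes": written, "seconds": seconds,
                "system_cpu": system_cpu, "direct": used_direct}
    finally:
        buf.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "out.bin")
        for direct in (False, True):
            print(write_sequential(target, 64 << 20, direct=direct))
```

To get meaningful numbers, point the target path at the disk under test rather than a temporary directory, which may live on an in-memory filesystem.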
>
> On Tue, Jan 4, 2011 at 9:58 AM, Da Zheng wrote:
>
>> The most important reason for
The most important reason for me to use direct I/O is that the Atom
processor is too weak. If I write a simple program that writes data to
the disk, CPU usage is almost 100% but the disk hasn't reached its maximum
bandwidth. When I write data to an SSD, the difference is even larger. Even
if the program ha
ronment of compiling Hadoop. I can use jposix, but I don't know how
to integrate it into Hadoop (jposix uses JNI). Are there any instructions for doing this?
Thank you,
Da
Original Message
Subject: Hadoop use direct I/O in Linux?
Date: Sun, 02 Jan 2011 15:01:18 -0500
From: Da Zheng
T