You are mixing a few things up. You're testing your I/O using C. What do you see if you test your direct I/O from Java? I'm guessing that you'll keep your I/O piece in place, wrap it in some JNI code, and then rewrite the test in Java?
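For illustration only, here is a minimal sketch of what a Java-side sequential write test could look like. It is an assumption on my part, not the code from this thread: the class name `DirectBufferWriteTest`, the helper `writeSequential`, and the block size and count are all hypothetical. It uses an off-heap buffer from `ByteBuffer.allocateDirect`, but note that a plain `FileChannel` still goes through the page cache, so this is not O_DIRECT; real direct I/O would need the JNI wrapper mentioned above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirectBufferWriteTest {

    // Hypothetical helper: writes `blocks` blocks of `blockSize` bytes
    // sequentially and returns the total number of bytes written.
    static long writeSequential(Path file, int blockSize, int blocks) throws IOException {
        // allocateDirect places the buffer outside the Java heap.
        // This is NOT O_DIRECT: writes still pass through the page cache.
        ByteBuffer buf = ByteBuffer.allocateDirect(blockSize);
        long written = 0;
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 0; i < blocks; i++) {
                buf.clear(); // reset position to 0 and limit to capacity
                while (buf.hasRemaining()) {
                    written += ch.write(buf);
                }
            }
            // Ask the OS to flush file data to the device before returning,
            // so that a caller timing this method includes the disk write.
            ch.force(false);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = args.length > 0
                ? Paths.get(args[0])
                : Files.createTempFile("dio-test", ".bin");
        int blockSize = 1 << 20; // 1 MiB per write (assumed value)
        int blocks = 256;        // 256 MiB in total (assumed value)

        long start = System.nanoTime();
        long bytes = writeSequential(file, blockSize, blocks);
        double seconds = (System.nanoTime() - start) / 1e9;

        System.out.printf("wrote %d bytes in %.2f s (%.1f MB/s)%n",
                bytes, seconds, bytes / 1e6 / seconds);
    }
}
```

Running it under `top` next to the C program would give a rough comparison of throughput and system CPU time for the same write pattern.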
Also, are you testing large streams or random I/O blocks? (Hopefully both.) I think that when you test the whole system, you'll find that you won't see much, if any, performance improvement.

-----Original Message-----
From: Da Zheng [mailto:zhen...@cs.jhu.edu]
Sent: Tuesday, January 04, 2011 11:11 PM
To: common-dev@hadoop.apache.org
Subject: Re: Hadoop use direct I/O in Linux?

On 1/4/11 5:17 PM, Christopher Smith wrote:
> If you use direct I/O to reduce CPU time, that means you are saving CPU via
> DMA. If you are using Java's heap though, you can kiss that goodbye.

The buffer for direct I/O cannot be allocated from Java's heap anyway, I don't
understand what you mean?

>
> That said, I'm surprised that the Atom can't keep up with magnetic disk
> unless you have a striped array. 100MB/s shouldn't be too taxing. Is it
> possible you're doing something wrong or your CPU is otherwise occupied?

Yes, my C program can reach 100MB/s or even 110MB/s when writing data to the
disk sequentially, but with direct I/O enabled, the maximal throughput is about
140MB/s. But the biggest difference is CPU usage.
Without direct I/O, operating system uses a lot of CPU time (the data below is
got with top, and this is a dual-core processor with hyperthread enabled).

Cpu(s): 3.4%us, 32.8%sy, 0.0%ni, 50.0%id, 12.1%wa, 0.0%hi, 1.6%si, 0.0%st

But with direct I/O, the system time can be as little as 3%.

Best,
Da

>
> On Tue, Jan 4, 2011 at 9:58 AM, Da Zheng <zhen...@cs.jhu.edu> wrote:
>
>> The most important reason for me to use direct I/O is that the Atom
>> processor is too weak. If I wrote a simple program to write data to the
>> disk, CPU is almost 100% but the disk hasn't reached its maximal bandwidth.
>> When I write data to SSD, the difference is even larger. Even if the program
>> has saturated the two cores of the CPU, it cannot even get to the half of
>> the maximal bandwidth of SSD.
>>
>> I don't know how much benefit direct I/O can bring to the normal processor
>> such as Xeon, but I have a feeling I have to use direct I/O in order to have
>> good performance on Atom processors.
>>
>> Best,
>> Da
>
>

The information contained in this communication may be CONFIDENTIAL and is intended only for the use of the recipient(s) named above. If you are not the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication, or any of its contents, is strictly prohibited. If you have received this communication in error, please notify the sender and delete/destroy the original message and any copy of it from your computer or paper files.