Hello,
I'm not sure which mailing list is the right one for this question, so
I am forwarding it to this list.
If no one is planning to add direct IO to Hadoop, I will do it myself.
I have located the relevant code, but I'm not familiar with Hadoop's
build environment. I can use jposix, but since it relies on JNI, I don't
know how to integrate it into Hadoop. Are there any instructions for
doing this?
Thank you,
Da
-------- Original Message --------
Subject: Hadoop use direct I/O in Linux?
Date: Sun, 02 Jan 2011 15:01:18 -0500
From: Da Zheng <[email protected]>
To: [email protected]
Hello,
Direct IO can make a huge performance difference, especially on Atom
processors, but as far as I know Hadoop doesn't use Linux direct IO. Does
anyone know of any unofficial versions that were developed to use direct IO?
I googled it and found that FUSE provides an option for direct IO. If I use
FUSE DFS and enable that option, will I get what I want? That is, when I write
data to HDFS, will the data be written to the disk directly, with no caching by
any file system? Or does the option only let me bypass the caching in FUSE,
while the data is still cached by the underlying FS?
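To make concrete what I mean by "written to the disk directly", here is a
minimal sketch of a direct write to a local file on Linux. It is in Python
rather than Java only to keep it short, and it is not Hadoop code. The
function names write_direct and _write_odirect and the 4096-byte block size
are my own assumptions for illustration; the right alignment depends on the
device and file system. O_DIRECT needs an aligned buffer and a transfer size
that is a multiple of the block size, so the sketch pads the data and then
truncates the file back to its real length. If O_DIRECT is not usable, it
falls back to an ordinary buffered write followed by fsync.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the device/FS


def _write_odirect(path, data, direct_flag):
    """Write data with O_DIRECT using a page-aligned, block-padded buffer."""
    # Round the transfer size up to a multiple of BLOCK (at least one block).
    padded = max(BLOCK, -(-len(data) // BLOCK) * BLOCK)
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | direct_flag
    fd = os.open(path, flags, 0o644)
    try:
        buf = mmap.mmap(-1, padded)  # anonymous mappings are page-aligned
        try:
            buf.write(data)  # remaining bytes stay zero-filled
            if os.write(fd, buf) != padded:
                raise OSError("short direct write")
        finally:
            buf.close()
        os.ftruncate(fd, len(data))  # drop the zero padding
    finally:
        os.close(fd)


def write_direct(path, data):
    """Return True if O_DIRECT was used, False if buffered IO was used instead."""
    direct_flag = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    if direct_flag:
        try:
            _write_odirect(path, data, direct_flag)
            return True
        except OSError:
            pass  # e.g. the file system rejects O_DIRECT; try the buffered path
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # flushed to the device, but still goes via the cache
    return False
```

Calling write_direct("/tmp/blk.dat", b"some block data") (the path is just an
example) returns whether the page cache was actually bypassed. My question
about FUSE is whether its direct IO option gives me the first path or, in
effect, only something like the second.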
Best,
Da