On Jan 3, 2011, at 5:17 PM, Christopher Smith wrote:

> On Mon, Jan 3, 2011 at 11:40 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>
>> It's not immediately clear to me the size of the benefit versus the costs.
>> Two cases where one normally thinks about direct I/O are:
>> 1) The usage scenario is a cache anti-pattern. This will be true for some
>> Hadoop use cases (MapReduce), not true for some others.
>>    - http://www.jeffshafer.com/publications/papers/shafer_ispass10.pdf
>> 2) The application manages its own cache. Not applicable.
>>
>> Atom processors, which you mention below, will just exacerbate (1) due to
>> the small cache size.
>
> Actually, assuming you thrash the cache anyway, having a smaller cache can
> often be a good thing. ;-)
Assuming no other thread wants to use that poor cache you are thrashing ;)

>> All-in-all, doing this specialization such that you don't hurt the general
>> case is going to be tough.
>
> For the Hadoop case, the advantages of O_DIRECT would seem to be
> comparatively petty compared to using O_APPEND and/or MMAP (yes, I realize
> this is not quite the same as what you are proposing, but it seems close
> enough for most cases). Your best case for a win is when you have
> reasonably random access to a file, and then something else that would
> benefit from more

Actually, our particular site would greatly benefit from O_DIRECT - we have non-MapReduce clients with a highly non-repetitive, random read I/O pattern and actively managed application-level read-ahead (note: because we're almost guaranteed to wait for a disk seek, as 2PB of SSDs are a touch pricey, the latency overheads of Java are not actually too important). The OS page cache is mostly useless for us, as the working set size is on the order of a few hundred TB.

However, I wouldn't actively clamor for O_DIRECT support; I could probably do wonders with an HDFS equivalent of fadvise (a sketch of what that hint looks like at the syscall level follows below). I really don't want to get into the business of managing buffering in my application code any more than we already do.

Brian

PS - if there are bored folks wanting to do something beneficial to high-performance HDFS, I'd note that it is currently tough to get >1Gbps performance from a single Hadoop client transferring multiple files. However, HP Labs had a clever approach: http://www.hpl.hp.com/techreports/2009/HPL-2009-345.pdf . I'd love to see a generic, easy-to-use API to do this.
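For context on the quoted point that this specialization is tough to do without hurting the general case: O_DIRECT bypasses the OS page cache, but it requires the user buffer, file offset, and transfer length to be sector-aligned, which is one reason it is awkward to support generically. A minimal C sketch of a direct read at the POSIX layer follows; the path and sizes are illustrative only and are not taken from HDFS.

#define _GNU_SOURCE           /* O_DIRECT is a Linux extension */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const size_t alignment = 4096;      /* assumed sector/page size */
    const size_t length    = 1 << 20;   /* 1 MiB read, a multiple of the alignment */

    /* Hypothetical block file name, for illustration only. */
    int fd = open("/data/blk_0001", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* O_DIRECT requires an aligned user buffer. */
    void *buf;
    if (posix_memalign(&buf, alignment, length) != 0) {
        close(fd);
        return 1;
    }

    /* The file offset must also be aligned; the read bypasses the page cache. */
    ssize_t n = pread(fd, buf, length, 0);
    if (n < 0)
        perror("pread");

    free(buf);
    close(fd);
    return 0;
}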
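The "HDFS equivalent of fadvise" mentioned above refers to posix_fadvise-style hints, which let an application describe its access pattern to the kernel and drop already-consumed regions from the page cache without taking over buffering itself. A minimal C sketch of those hints, again with a made-up file name and region size:

#define _XOPEN_SOURCE 600     /* for posix_fadvise */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical block file name, for illustration only. */
    int fd = open("/data/blk_0001", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* Tell the kernel the access pattern is random, so it skips the
     * aggressive read-ahead that would only pollute the page cache. */
    posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);

    /* ... application-level read-ahead issues its own pread() calls ... */

    /* Once a region has been consumed, advise the kernel to drop it so a
     * working set far larger than RAM does not evict other users' data. */
    posix_fadvise(fd, 0, 64 * 1024 * 1024, POSIX_FADV_DONTNEED);

    close(fd);
    return 0;
}

The appeal over O_DIRECT, in line with the message above, is that the application only advises the kernel about its access pattern; it does not have to manage aligned buffers or take ownership of buffering in its own code.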