[jira] Created: (HADOOP-6313) Expose flush APIs to application users

Hairong Kuang (JIRA) Wed, 14 Oct 2009 10:20:56 -0700

Expose flush APIs to application users
--------------------------------------


                 Key: HADOOP-6313
                 URL: https://issues.apache.org/jira/browse/HADOOP-6313
             Project: Hadoop Common
          Issue Type: New Feature
          Components: fs
            Reporter: Hairong Kuang
            Assignee: Hairong Kuang
             Fix For: 0.21.0


Earlier this year, Yahoo, Facebook, and HBase developers held a roundtable 
discussion where we agreed to support three types of flush in HDFS (API1, API2, 
and API3). The append project aims to implement API2. Here is a proposal to 
expose these APIs to application users.

1. Three flush APIs
* API1: flushes data out of the client's address space into the socket to the 
data nodes (DNs).  When the call returns there is no guarantee that the data 
has left the client node, and no guarantee that it has reached a DN.  New 
readers will eventually see this data if there are no failures.
* API2: flushes out to all replicas of the block. The data is in the buffers of 
the DN processes but is not guaranteed to be in the DNs' OS buffers.  New 
readers will see the data after the call has returned.
* API3: flushes out to all replicas, and all replicas have done the equivalent 
of a POSIX fsync, i.e. the OS has flushed the data to the disk device (but the 
disk may still hold it in its cache).
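
As a rough illustration only, the three levels can be summarized as a small 
table of guarantees. The enum below is a toy model, not Hadoop code: the names 
FlushGuarantee and weakestFor are hypothetical, and it assumes that API3 also 
provides everything API2 does, which the list above implies but does not state.

```java
/** Toy summary of what holds once each proposed flush call has returned. */
enum FlushGuarantee {
    //   onAllReplicas, visibleToNewReaders, fsyncedOnReplicas
    API1(false, false, false),
    API2(true,  true,  false),
    API3(true,  true,  true);   // assumption: API3 includes API2's guarantees

    final boolean onAllReplicas;
    final boolean visibleToNewReaders;
    final boolean fsyncedOnReplicas;

    FlushGuarantee(boolean onAllReplicas, boolean visibleToNewReaders,
                   boolean fsyncedOnReplicas) {
        this.onAllReplicas = onAllReplicas;
        this.visibleToNewReaders = visibleToNewReaders;
        this.fsyncedOnReplicas = fsyncedOnReplicas;
    }

    /** Returns the weakest level that satisfies the caller's requirements. */
    static FlushGuarantee weakestFor(boolean needVisible, boolean needFsync) {
        for (FlushGuarantee g : values()) {   // declaration order: weakest first
            boolean visibleOk = !needVisible || g.visibleToNewReaders;
            boolean fsyncOk = !needFsync || g.fsyncedOnReplicas;
            if (visibleOk && fsyncOk) {
                return g;
            }
        }
        throw new IllegalStateException("unreachable: API3 satisfies every request");
    }
}
```

For example, a writer that only needs new readers to see its data would pick 
API2 under this model, while one that needs the data handed to the disk device 
would pick API3.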

2. Support flush APIs in FS
* FSDataOutputStream#flush() supports API1.
* FSDataOutputStream implements the Syncable interface defined below. If its 
wrapped output stream (i.e. each file system's stream) is Syncable, 
FSDataOutputStream#hflush() and FSDataOutputStream#hsync() call the wrapped 
output stream's hflush() and hsync() respectively.
{noformat}
  public interface Syncable {
    public void hflush() throws IOException;  // support API2
    public void hsync() throws IOException;   // support API3
  }
{noformat}
* In each file system, if only hflush() is implemented, hsync() by default 
calls hflush().  If only hsync() is implemented, hflush() by default calls 
flush().
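
The delegation and default rules above might be sketched as follows. This is a 
minimal, self-contained sketch and not the actual patch: WrapperStream stands 
in for FSDataOutputStream, and DefaultingSyncableStream and RecordingStream 
are hypothetical names. The fallback to flush() when the wrapped stream is not 
Syncable is an assumption; the proposal does not specify that case.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** The interface from the proposal. */
interface Syncable {
    void hflush() throws IOException; // API2
    void hsync() throws IOException;  // API3
}

/** Hypothetical base class for a file system's stream; encodes the default rules. */
abstract class DefaultingSyncableStream extends OutputStream implements Syncable {
    @Override
    public void hflush() throws IOException {
        flush();   // default used when only hsync() is implemented
    }

    @Override
    public void hsync() throws IOException {
        hflush();  // default used when only hflush() is implemented
    }
}

/** Hypothetical stand-in for FSDataOutputStream: delegates to the wrapped stream. */
class WrapperStream extends FilterOutputStream implements Syncable {
    WrapperStream(OutputStream wrapped) {
        super(wrapped);
    }

    @Override
    public void hflush() throws IOException {
        if (out instanceof Syncable) {
            ((Syncable) out).hflush();
        } else {
            out.flush(); // assumption: behavior for non-Syncable streams is not specified
        }
    }

    @Override
    public void hsync() throws IOException {
        if (out instanceof Syncable) {
            ((Syncable) out).hsync();
        } else {
            out.flush(); // assumption: behavior for non-Syncable streams is not specified
        }
    }
}

/** Toy file-system stream that implements only hflush() and records the calls it sees. */
class RecordingStream extends DefaultingSyncableStream {
    final List<String> calls = new ArrayList<>();
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();

    @Override
    public void write(int b) {
        data.write(b);
    }

    @Override
    public void hflush() {
        calls.add("hflush");
    }
}

class SyncableDemo {
    public static void main(String[] args) throws IOException {
        RecordingStream inner = new RecordingStream();
        WrapperStream stream = new WrapperStream(inner);
        stream.write(42);
        stream.hsync();  // only hflush() is implemented, so hsync() falls back to it
        System.out.println(inner.calls); // prints [hflush]
    }
}
```

Putting the defaults in a shared base class is one possible design; it lets a 
file system override only the call it can really support while callers still 
see both methods.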

