Hi Stack,

Why don’t we look at the design of what is being proposed?  Let us post the 
design to HDFS-9924 and then if needed, by all means let us open a new Jira.
That will make it easy to understand the context if someone is looking at 
HDFS-9924.

I personally believe that the developers of the feature should decide what 
goes in, what to call the branch, etc. But it would be nice to have some sort 
of continuity with HDFS-9924.

Thanks
Anu

From: <saint....@gmail.com> on behalf of Stack <st...@duboce.net>
Date: Thursday, May 3, 2018 at 9:04 PM
To: Anu Engineer <aengin...@hortonworks.com>
Cc: Wei-Chiu Chuang <weic...@apache.org>, "hdfs-dev@hadoop.apache.org" 
<hdfs-dev@hadoop.apache.org>
Subject: Re: [DISCUSSION] Create a branch to work on non-blocking access to HDFS

Thanks for the support, Wei-Chiu and Anu.

Thinking more on it, we should just open a new JIRA. HDFS-9924 is an old branch 
with commits we don't need, full of commentary that is, ahem, a mite off-topic.  
Duo can attach his design to the new issue. We can cite HDFS-9924 as provenance 
and aggregate the discussion as a launching pad for the new effort in the new 
issue.

Hopefully this is agreeable,
Thanks,

S

On Thu, May 3, 2018 at 1:54 PM, Anu Engineer 
<aengin...@hortonworks.com<mailto:aengin...@hortonworks.com>> wrote:
Hi St.ack/Wei-Chiu,

It is very kind of St.Ack to bring this question to HDFS Dev. I think this is a 
good feature to have. As for the branch question, the HDFS-9924 branch is 
already open; we could just use that, and I am +1 on adding Duo as a branch 
committer.

I am not familiar with the HBase code base, so I am presuming that there will be 
some deviation from the current design doc posted in HDFS-9924. Would it make 
sense to post a new design proposal on HDFS-9924?

--Anu



On 5/3/18, 9:29 AM, "Wei-Chiu Chuang" 
<weic...@apache.org<mailto:weic...@apache.org>> wrote:

    Given that HBase 2 uses async output by default, the way that code is
    maintained today in HBase is not sustainable. That piece of code should be
    maintained in HDFS. I am +1 as a participant in both communities.

    On Thu, May 3, 2018 at 9:14 AM, Stack 
<st...@duboce.net<mailto:st...@duboce.net>> wrote:

    > Ok with you lot if a few of us open a branch to work on a non-blocking HDFS
    > client?
    >
    > Intent is to finish up the old issue "HDFS-9924 [umbrella] Nonblocking HDFS
    > Access". On the foot of this umbrella JIRA is a proposal by the
    > heavy-lifter, Duo Zhang. Over in HBase, we have a limited async DFS client
    > (written by Duo) that we use making Write-Ahead Logs. We call it
    > AsyncFSWAL. It was shipped as the default WAL writer in hbase-2.0.0.
    >
    > Let me quote Duo from his proposal at the base of HDFS-9924:
    >
    > ....We use lots of internal APIs of HDFS to implement the AsyncFSWAL, so it
    > is expected that things like HBASE-20244
    > <https://issues.apache.org/jira/browse/HBASE-20244>
    > ["NoSuchMethodException
    > when retrieving private method decryptEncryptedDataEncryptionKey from
    > DFSClient"] will happen again and again.
    >
    > To make life easier, we need to move the async output related code into
    > HDFS. The POC [attached as patch on HDFS-9924] shows that option 3 [1] can
    > work, so I would like to create a feature branch to implement the async dfs
    > client. In general I think there are 4 steps:
    >
    > 1. Implement an async rpc client with option 3 [1] described above.
    > 2. Implement the filesystem APIs which only need to connect to NN, such as
    > 'mkdirs'.
    > 3. Implement async file read. The problem is the API. For pread I think a
    > CompletableFuture is enough, the problem is for the streaming read. Need to
    > discuss later.
    > 4. Implement async file write. The API will also be a problem, but a more
    > important problem is that, if we want to support fan-out, the current logic
    > at DN side will make the semantic broken as we can read uncommitted data
    > very easily. In HBase it is solved by HBASE-14004
    > <https://issues.apache.org/jira/browse/HBASE-14004> but I do not think we
    > should keep the broken behavior in HDFS. We need to find a way to deal with
    > it.
    >
    > Comments welcome.
    >
    > Intent is to make a branch named HDFS-9924 (or should we just do a new
    > JIRA?) and to add Duo as a feature branch committer. If all goes well,
    > we'll call for a merge VOTE.
    >
    > Thanks,
    > St.Ack
    >
    > 1. Option 3: "Use the old protobuf rpc interface and implement a new rpc
    > framework. The benefit is that we also do not need port unification service
    > at server side and do not need to maintain two implementations at server
    > side. And one more thing is that we do not need to upgrade protobuf to
    > 3.x."
    >



    --
    A very happy Hadoop contributor

