Hi

Chris Douglas and I have a proposal for a short-lived feature branch for 
the Azure ABFS connector to go into the hadoop-azure package. This will connect 
to the new Azure storage service, which will ultimately replace the one used by 
wasb. It's a big patch and, like all storage connectors, will inevitably take 
time to stabilize (e.g., nobody ever gets seek() right, even when we think we 
have).

Thomas & Esfandiar will do the coding: they've already done the paperwork. 
Chris, I, and anyone else interested can take part in the review and 
testing.

Comments?

-------------

The initial HADOOP-15407 patch contains a new filesystem client for the 
forthcoming Azure ABFS, which is intended to replace Azure WASB as the Azure 
storage layer. The patch is large, as it contains the replacement client, 
tests, and generated code.

We propose a feature branch, so the module can be broken into salient, 
reviewable chunks. Internal constraints prevented this feature from being 
developed in Apache, so we want to ensure that all the code is discussed, 
maintainable, and documented by the community before it merges.

To effect this, we also propose adding two developers as branch committers: 
Thomas Marquardt tm...@microsoft.com
Esfandiar Manii esma...@microsoft.com

Beyond normal feature branch activity and merge criteria for FS modules, we 
want to add another merge criterion for ABFS. Some of the client APIs are not 
GA. It seems reasonable to require that this client works with public endpoints 
before it merges to trunk.

To test the Blob FS driver, the Blob FS team in Azure Storage (including 
Esfandiar Manii and Thomas Marquardt) will need the MSDN subscription ID(s) of 
all reviewers who want to run the tests. The ABFS team will then whitelist the 
subscription ID(s) for the Blob FS Preview. From that point on, newly created 
storage accounts will have the Blob FS endpoint, 
<accountName>.dfs.core.windows.net, which the Blob FS driver relies on.

This is a temporary state during the (current) Private Preview and the early 
phases of Public Preview. In a few months, the whitelisting will not be 
required and anyone will be able to create a storage account with access to the 
Blob FS endpoint.

Thomas and Esfandiar have been active in the Hadoop project working on the WASB 
connector (see https://issues.apache.org/jira/browse/HADOOP-14552). They 
understand the processes and requirements of the software. Working on the 
branch directly will let them bring this significant feature into the 
hadoop-azure module without disrupting existing users.
