+1 from me. There are some unrelated errors building the branch right now
due to annotations in some YARN code, etc., but I was able to generate an
fsimage from an S3 bucket and serve the content through HDFS on a
pseudo-distributed HDFS node this morning. Seems like a good point for a
merge.
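For anyone wanting to reproduce this locally, the rough sequence looks like the following. This is only a sketch: the image-generation class and its flags are as described in the branch's HdfsProvidedStorage.md, and the bucket name, paths, and option values here are placeholders, so check the docs on the branch before running.

```shell
# Generate an fsimage mirroring the namespace of a remote S3 bucket.
# The FileSystemImage tool and its options come from the HDFS-9806 branch
# docs; treat the exact flags and paths below as illustrative.
hadoop org.apache.hadoop.hdfs.server.namenode.FileSystemImage \
  -Ddfs.provided.aliasmap.text.write.dir=file:///tmp/providedstorage/ \
  -b org.apache.hadoop.hdfs.server.common.blockaliasmap.impl.TextFileRegionAliasMap \
  s3a://my-bucket/

# Point dfs.namenode.name.dir at the generated image directory (do NOT
# reformat the NameNode, or the generated image is lost), then start the
# pseudo-distributed cluster as usual.
start-dfs.sh

# The remote namespace should now be visible through HDFS.
hdfs dfs -ls /
```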

On Wed, Dec 13, 2017 at 11:55 AM, Anu Engineer <aengin...@hortonworks.com>
wrote:

> Hi Virajith / Chris/ Thomas / Ewan,
>
> Thanks for developing this feature and getting to merge state.
> I would like to vote +1 for this merge. Thanks for all the hard work.
>
> Thanks
> Anu
>
>
> On 12/8/17, 7:11 PM, "Virajith Jalaparti" <virajit...@gmail.com> wrote:
>
>     Hi,
>
>     We have tested the HDFS-9806 branch in two settings:
>
>     (i) 26-node bare-metal cluster, with PROVIDED storage configured to
>     point to another instance of HDFS (containing 468 files, ~400GB of
>     data). Half of the Datanodes are configured with only DISK volumes;
>     the other half have both DISK and PROVIDED volumes.
>     (ii) 8 VMs on Azure, with PROVIDED storage configured to point to a
>     WASB account (containing 26,074 files and ~1.3TB of data). All
>     Datanodes are configured with DISK and PROVIDED volumes.
>
>     (i) was tested using both the text-based alias map
>     (TextFileRegionAliasMap) and the in-memory leveldb-based alias map
>     (InMemoryLevelDBAliasMapClient), while (ii) was tested using the
>     text-based alias map only.
>
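For context, a DISK + PROVIDED Datanode of the kind described above is configured along these lines. This is a sketch only: the property names follow the branch's HdfsProvidedStorage.md, and every path and value here is a placeholder rather than what was used in these tests.

```xml
<!-- hdfs-site.xml fragment; all values are placeholders -->
<property>
  <name>dfs.namenode.provided.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- One local DISK volume plus one PROVIDED volume on the Datanode -->
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]file:///grid/0/hdfs/dn,[PROVIDED]remoteFS/</value>
</property>
<property>
  <!-- Text-based alias map, as used in both test settings -->
  <name>dfs.provided.aliasmap.class</name>
  <value>org.apache.hadoop.hdfs.server.common.blockaliasmap.impl.TextFileRegionAliasMap</value>
</property>
<property>
  <name>dfs.provided.aliasmap.text.read.file</name>
  <value>file:///tmp/providedstorage/fileRegions.csv</value>
</property>
```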
>     Steps followed:
>     (0) Build from apache/HDFS-9806. (Note that for the leveldb-based
>     alias map, the patch posted to HDFS-12912
>     <https://issues.apache.org/jira/browse/HDFS-12912> needs to be
>     applied; we will commit this to apache/HDFS-9806 after review.)
>     (1) Generate the FSImage using the image generation tool with the
>     appropriate remote location (hdfs:// in (i) and wasb:// in (ii)).
>     (2) Bring up the HDFS cluster.
>     (3) Verify that the remote namespace is reflected correctly and that
>     data on the remote store can be accessed. Commands run: ls,
>     copyToLocal, fsck, getrep, setrep, getStoragePolicy.
>     (4) Run Sort and Gridmix jobs on the data in the remote location
>     with the input paths pointing to the local HDFS.
>     (5) Increase replication of the PROVIDED files and verify, using
>     fsck, that local (DISK) replicas were created for the PROVIDED
>     replicas.
>     (6) Verify that PROVIDED storage capacity is shown correctly on the
>     NN and Datanode Web-UI.
>     (7) Bring down Datanodes one by one. When all are down, verify that
>     the NN reports all PROVIDED files as missing. Bringing back up any
>     one Datanode makes all the data available again.
>     (8) Restart the NN and verify that data is still accessible.
>     (9) Verify that writes to local HDFS continue to work.
>     (10) Bring down all Datanodes except one. Start decommissioning the
>     remaining Datanode. Verify that the data in the PROVIDED storage is
>     still accessible.
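The verification in steps (3) and (5) uses ordinary HDFS client operations; against a running cluster they look roughly like the following. The paths and the replication factor are examples, not the values used in the tests above.

```shell
# Step (3): confirm the mounted remote namespace is visible and readable
hdfs dfs -ls /
hdfs dfs -copyToLocal /some/provided/file /tmp/file
hdfs fsck /some/provided/file -files -blocks -locations
hdfs storagepolicies -getStoragePolicy -path /some/provided/file

# Step (5): raise replication, then confirm via fsck that new local
# (DISK) replicas were created alongside the PROVIDED replica
hdfs dfs -setrep 2 /some/provided/file
hdfs fsck /some/provided/file -files -blocks -locations
```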
>
>     Apart from the above, we ported the changes in HDFS-9806 to
>     branch-2.7 and deployed it on a ~800-node cluster as one of the
>     sub-clusters in a Router-based Federated HDFS of nearly 4000 nodes
>     (with help from Inigo Goiri). We mounted about 1000 files, 650TB of
>     remote data (~2.6 million blocks with a 256MB block size), in this
>     cluster using the text-based alias map. We verified that the basic
>     commands (ls, copyToLocal, setrep) work. We also ran Spark jobs
>     against this cluster.
>
>     -Virajith
>
>
>     On Fri, Dec 8, 2017 at 3:44 PM, Chris Douglas <cdoug...@apache.org>
> wrote:
>
>     > Discussion thread: https://s.apache.org/kxT1
>     >
>     > We're down to the last few issues and are preparing the branch to
>     > merge to trunk. We'll post merge patches to HDFS-9806 [1]. Minor,
>     > "cleanup" tasks (checkstyle, findbugs, naming, etc.) will be tracked
>     > in HDFS-12712 [2].
>     >
>     > We've tried to ensure that when this feature is disabled, HDFS is
>     > unaffected. For those reviewing this, please look for places where
>     > this might add overheads and we'll address them before the merge. The
>     > site documentation [3] and design doc [4] should be up to date and
>     > sufficient to try this out. Again, please point out where it is
>     > unclear and we can address it.
>     >
>     > This has been a long effort and we're grateful for the support we've
>     > received from the community. In particular, thanks to Íñigo Goiri,
>     > Andrew Wang, Anu Engineer, Steve Loughran, Sean Mackrory, Lukas
>     > Majercak, Uma Gunuganti, Kai Zheng, Rakesh Radhakrishnan, Sriram Rao,
>     > Lei Xu, Zhe Zhang, Jing Zhao, Bharat Viswanadham, ATM, Chris Nauroth,
>     > Sanjay Radia, Atul Sikaria, and Peng Li for all your input into the
>     > design, testing, and review of this feature.
>     >
>     > The vote will close no earlier than one week from today, 12/15. -C
>     >
>     > [1]: https://issues.apache.org/jira/browse/HDFS-9806
>     > [2]: https://issues.apache.org/jira/browse/HDFS-12712
>     > [3]: https://github.com/apache/hadoop/blob/HDFS-9806/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsProvidedStorage.md
>     > [4]: https://issues.apache.org/jira/secure/attachment/12875791/HDFS-9806-design.002.pdf
>     >
>     > ---------------------------------------------------------------------
>     > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>     > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>     >
>     >
>
>
>
