[jira] [Created] (HADOOP-15094) FileSystem
Steve Loughran created HADOOP-15094:
---------------------------------------

             Summary: FileSystem
                 Key: HADOOP-15094
                 URL: https://issues.apache.org/jira/browse/HADOOP-15094
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Steve Loughran

Discussion around SPARK-22587 highlights how per-fs notions of a canonical URI make it hard to determine if a file is on a specific filesystem, or, put differently, if two filesystems are equivalent. You can't reliably use this.getUri == that.getUri as it doesn't handle FQDN == unqualified DN, but you can't do an nslookup either, as HDFS HA doesn't use hostnames.

If {{FileSystem.getCanonicalUri()}} were public, then this could be used to compare things consistently.

Needs: filesystem.md coverage; contract test (two filesystem instances are equal, different filesystems aren't). Or at least: this method never returns null.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
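A quick illustration of why plain URI equality is unreliable here (the hostnames are hypothetical):

```java
import java.net.URI;

// The same namenode can be referenced by a short name or an FQDN;
// java.net.URI equality compares authorities textually, so the two
// URIs below are "different" even if they name the same cluster.
// A public FileSystem.getCanonicalUri() would resolve such differences.
public class CanonicalUriDemo {
    public static void main(String[] args) {
        URI shortName = URI.create("hdfs://namenode:8020");
        URI fqdn = URI.create("hdfs://namenode.example.com:8020");
        // Same cluster in practice, but textual equality says otherwise:
        System.out.println(shortName.equals(fqdn)); // prints "false"
    }
}
```

And since HDFS HA nameservice IDs are logical names, not hostnames, DNS resolution cannot close this gap either.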
[jira] [Resolved] (HADOOP-13967) S3ABlockOutputStream to support plugin point for different multipart strategies
[ https://issues.apache.org/jira/browse/HADOOP-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-13967.
-------------------------------------
    Resolution: Duplicate
    Fix Version/s: 3.1.0

> S3ABlockOutputStream to support plugin point for different multipart
> strategies
> ---
>
> Key: HADOOP-13967
> URL: https://issues.apache.org/jira/browse/HADOOP-13967
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: 3.1.0
>
> For 0-rename commits, we need to delay the final commit of a multipart PUT,
> instead saving the data needed to build that commit into the s3 bucket.
> This means changes to {{S3ABlockOutputStream}} so that it can support
> different policies on how to do this, "classic" and "delayed commit".
> Having this self contained means we can test it in isolation of anything else.
> I'm ignoring the old output stream...we will switch to fast output whenever a
> special destination path is encountered
[jira] [Resolved] (HADOOP-13969) S3A to support commit(path) operation, which commits all pending put commits in a path
[ https://issues.apache.org/jira/browse/HADOOP-13969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-13969.
-------------------------------------
    Resolution: Duplicate
    Assignee: Steve Loughran
    Fix Version/s: 3.1.0

> S3A to support commit(path) operation, which commits all pending put commits
> in a path
> --
>
> Key: HADOOP-13969
> URL: https://issues.apache.org/jira/browse/HADOOP-13969
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: 3.1.0
>
> as well as creating and saving data with a pending-commit, s3a needs to add
> the actual commit operation.
> this would scan a directory, take its pending commits, read them in and
> execute them.
> issue: what to do on failures?
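The scan-and-commit loop described in the issue can be sketched as follows (the class name, the ".pending" suffix, and the string record format are illustrative assumptions, not the real S3A committer code):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.function.Consumer;

// Hypothetical sketch of "commit(path)": scan a directory for saved
// pending-commit records, execute each one, and collect failures instead
// of aborting on the first, since the JIRA leaves failure policy open.
public class PendingCommitter {
    public static List<Path> commitAll(Path pendingDir, Consumer<String> executeCommit)
            throws IOException {
        List<Path> failures = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                Files.newDirectoryStream(pendingDir, "*.pending")) {
            for (Path p : stream) {
                try {
                    // Read the saved commit record and execute it.
                    executeCommit.accept(Files.readString(p));
                    Files.delete(p); // commit done; drop the pending record
                } catch (RuntimeException e) {
                    failures.add(p); // keep the record for retry/inspection
                }
            }
        }
        return failures;
    }
}
```

Keeping the failed records on disk is one possible answer to the "what to do on failures?" question: a later retry can re-scan the same directory.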
[jira] [Resolved] (HADOOP-13968) S3a FS to support "__magic" path for the special "unmaterialized" writes
[ https://issues.apache.org/jira/browse/HADOOP-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-13968.
-------------------------------------
    Resolution: Duplicate
    Assignee: Steve Loughran
    Fix Version/s: 3.1.0

> S3a FS to support "__magic" path for the special "unmaterialized" writes
>
> Key: HADOOP-13968
> URL: https://issues.apache.org/jira/browse/HADOOP-13968
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0-beta1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: 3.1.0
>
> S3AFileSystem to add support for a special path, such as
> {{.temp_pending_put/}} or similar, which, when used as the base of a path,
> indicates that the file is actually to be saved to the parent dir, but only
> via a delayed put commit operation.
> At the same time, we may need to block some normal file IO ops under these
> dirs, especially rename and delete, as this would cause serious problems,
> including data loss and large bills for pending data.
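The path-mapping idea can be sketched as below (the "__magic" marker name comes from the subject line; the helper and its exact behavior are illustrative assumptions, not the actual S3A mapping, which handles more cases):

```java
import java.util.*;

// Illustrative mapping from a path under the special "__magic" directory
// back to its real destination: the bookkeeping segments between the
// magic marker and the filename (e.g. job/task attempt dirs) are dropped,
// so the file "lands" in the magic directory's parent.
public class MagicPath {
    public static String finalDestination(String path) {
        List<String> parts = Arrays.asList(path.split("/"));
        int i = parts.indexOf("__magic");
        if (i < 0) {
            return path; // not a magic path; unchanged
        }
        return String.join("/", parts.subList(0, i))
                + "/" + parts.get(parts.size() - 1);
    }
}
```

For example, "s3a://bucket/dest/__magic/job_1/part-0" would map to "s3a://bucket/dest/part-0", which is why rename and delete under a magic directory need guarding: the visible paths and the eventual data locations diverge.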
[jira] [Created] (HADOOP-15095) S3a committer factory to warn when default FileOutputFormat committer is created
Steve Loughran created HADOOP-15095:
---------------------------------------

             Summary: S3a committer factory to warn when default FileOutputFormat committer is created
                 Key: HADOOP-15095
                 URL: https://issues.apache.org/jira/browse/HADOOP-15095
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
            Reporter: Steve Loughran
            Priority: Minor

The S3ACommitterFactory should warn when the classic FileOutputCommitter is used (i.e. the client is not configured to use a new one). Something like "this committer is neither fast nor guaranteed to be correct. See $URL" where URL is a pointer to something (wiki? hadoop docs?).
Re: [VOTE] Release Apache Hadoop 2.7.5 (RC0)
+1 (non-binding)

- Verified signatures, MD5, RMD160, SHA* for bin and src tarballs
- Built from source on macOS 10.12.6 and RHEL 6.6
- Ran local HDFS cluster, ran basic commands, verified read and write capability
- Ran a 3000-node cluster via Dynamometer and do not see significant performance variation from 2.7.4 expectations

@Brahma, I was able to find HDFS-12831, HADOOP-14881, and HADOOP-14827 in CHANGES.txt, but agree with you on the others listed. I was, however, able to find all of them in the linked releasenotes.html.

Thanks Konstantin!
Erik

On 12/4/17, 10:50 PM, "Brahma Reddy Battula" wrote:

    +1 (non-binding), thanks Konstantin for driving this.

    --Built from the source
    --Installed 3 Node HA Cluster
    --Ran basic shell commands
    --Verified append/snapshot/truncate
    --Ran sample jobs like pi, wordcount

    It looks like the following commits are missing from changes.txt:

    MAPREDUCE-6975
    HADOOP-14919
    HDFS-12596
    YARN-7084
    HADOOP-14881
    HADOOP-14827
    HDFS-12832

    --Brahma Reddy Battula

    -----Original Message-----
    From: Konstantin Shvachko [mailto:shv.had...@gmail.com]
    Sent: 02 December 2017 10:13
    To: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
    Subject: [VOTE] Release Apache Hadoop 2.7.5 (RC0)

    Hi everybody,

    This is the next dot release of the Apache Hadoop 2.7 line. The previous one, 2.7.4, was released August 4, 2017.
    Release 2.7.5 includes critical bug fixes and optimizations. See more details in the Release Note:
    http://home.apache.org/~shv/hadoop-2.7.5-RC0/releasenotes.html

    The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.5-RC0/

    Please give it a try and vote on this thread. The vote will run for 5 days, ending 12/08/2017.
Re: [VOTE] Release Apache Hadoop 2.7.5 (RC0)
Thanks for the release Konstantin. Verified the following:

- Downloaded the tar on Ubuntu and verified the signatures
- Deployed a pseudo cluster
- Sanity checks
- Basic HDFS operations
- Spark PyWordcount & a few MR jobs
- Accessed most of the web UIs

When accessing the docs (from the tar) I was able to notice:

- Release Notes, Common, HDFS, and MapReduce Changes show "file not found"
- I observed that changes for all components were not available for 2.7.4 as well
  ( http://hadoop.apache.org/docs/r2.7.4/hadoop-project-dist/hadoop-common/CHANGES.txt )

So I am not sure whether it's missed or not required; everything else is fine.

Regards,
+ Naga

On Tue, Dec 5, 2017 at 2:50 PM, Brahma Reddy Battula < brahmareddy.batt...@huawei.com> wrote:

> +1 (non-binding), thanks Konstantin for driving this.
>
> --Built from the source
> --Installed 3 Node HA Cluster
> --Ran basic shell commands
> --Verified append/snapshot/truncate
> --Ran sample jobs like pi, wordcount
>
> It looks like the following commits are missing from changes.txt:
>
> MAPREDUCE-6975
> HADOOP-14919
> HDFS-12596
> YARN-7084
> HADOOP-14881
> HADOOP-14827
> HDFS-12832
>
> --Brahma Reddy Battula
>
> -----Original Message-----
> From: Konstantin Shvachko [mailto:shv.had...@gmail.com]
> Sent: 02 December 2017 10:13
> To: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: [VOTE] Release Apache Hadoop 2.7.5 (RC0)
>
> Hi everybody,
>
> This is the next dot release of the Apache Hadoop 2.7 line. The previous
> one, 2.7.4, was released August 4, 2017.
> Release 2.7.5 includes critical bug fixes and optimizations. See more
> details in the Release Note:
> http://home.apache.org/~shv/hadoop-2.7.5-RC0/releasenotes.html
>
> The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.5-RC0/
>
> Please give it a try and vote on this thread. The vote will run for 5 days,
> ending 12/08/2017.
>
> My up to date public key is available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Thanks,
> --Konstantin
>
Re: [VOTE] Merge Absolute resource configuration support in Capacity Scheduler (YARN-5881) to trunk
+1. Skimmed through the design doc and uber patch; seems reasonable. This is a welcome addition, especially w.r.t. cloud deployments, so thanks to everyone who worked on this.

On Mon, Dec 4, 2017 at 8:18 PM, Rohith Sharma K S wrote:

> +1
>
> On Nov 30, 2017 7:26 AM, "Sunil G" wrote:
>
> > Hi All,
> >
> > Based on the discussion at [1], I'd like to start a vote to merge feature
> > branch YARN-5881 to trunk. Vote will run for 7 days, ending Wednesday
> > Dec 6 at 6:00PM PDT.
> >
> > This branch adds support to configure queue capacity as an absolute
> > resource in the capacity scheduler. This will help admins who want fine
> > control of the resources of queues.
> >
> > Feature development is done at YARN-5881 [2]; the jenkins build is here
> > (YARN-7510 [3]).
> > All required tasks for this feature are committed. This feature changes
> > the RM's Capacity Scheduler only, and we did extensive tests for the
> > feature in the last couple of months, including performance tests.
> >
> > Key points:
> > - The feature is turned off by default; absolute resources have to be
> >   configured to enable it.
> > - Detailed documentation about how to use this feature is done as part
> >   of [4].
> > - No major performance degradation is observed with this branch work.
> >   SLS and UT performance tests are done.
> >
> > There were 11 subtasks completed for this feature.
> >
> > Huge thanks to everyone who helped with reviews, commits, guidance, and
> > technical discussion/design, including Wangda Tan, Vinod Vavilapalli,
> > Rohith Sharma K S, Eric Payne.
> > > > > > [1] : > > http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201711.mbox/% > > 3CCACYiTuhKhF1JCtR7ZFuZSEKQ4sBvN_n_tV5GHsbJ3YeyJP%2BP4Q% > > 40mail.gmail.com%3E > > > > [2] : https://issues.apache.org/jira/browse/YARN-5881 > > > > [3] : https://issues.apache.org/jira/browse/YARN-7510 > > > > [4] : https://issues.apache.org/jira/browse/YARN-7533 > > > > > > Regards > > > > Sunil and Wangda > > >
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/

[Dec 4, 2017 6:40:11 PM] (xiao) HDFS-12396. Webhdfs file system should get delegation token from kms
[Dec 4, 2017 8:11:00 PM] (eyang) YARN-6669. Implemented Kerberos security for YARN service framework.
[Dec 4, 2017 9:14:55 PM] (rkanter) YARN-5594. Handle old RMDelegationToken format when recovering RM
[Dec 4, 2017 10:39:43 PM] (mackrorysd) HADOOP-15058. create-release site build outputs dummy shaded jars due to
[Dec 5, 2017 5:02:04 AM] (arp) HADOOP-14976. Set HADOOP_SHELL_EXECNAME explicitly in scripts.
[Dec 5, 2017 5:30:46 AM] (aajisaka) HADOOP-14985. Remove subversion related code from VersionInfoMojo.java.
[Dec 5, 2017 12:58:31 PM] (sunilg) YARN-7586. Application Placement should be done before ACL checks in
[Dec 5, 2017 2:11:07 PM] (sunilg) YARN-7092. Render application specific log under application tab in new
[Dec 5, 2017 2:23:46 PM] (brahma) HDFS-11751. DFSZKFailoverController daemon exits with wrong status code.
[Dec 5, 2017 3:05:41 PM] (stevel) HADOOP-15071 S3a troubleshooting docs to add a couple more failure
[Dec 5, 2017 5:20:07 PM] (sunilg) YARN-7438. Additional changes to make SchedulingPlacementSet agnostic to
[Dec 5, 2017 7:06:32 PM] (fabbri) HADOOP-14475 Metrics of S3A don't print out when enabled. Contributed by
[Dec 5, 2017 9:09:49 PM] (wangda) YARN-7381. Enable the configuration:
[Dec 6, 2017 2:40:33 AM] (aajisaka) HDFS-12889. Router UI is missing robots.txt file. Contributed by Bharat
[Dec 6, 2017 4:01:36 AM] (zhengkai.zk) HADOOP-15039. Move SemaphoredDelegatingExecutor to hadoop-common.
[Dec 6, 2017 4:21:52 AM] (wwei) YARN-7611. Node manager web UI should display container type in
[Dec 6, 2017 4:48:16 AM] (xiao) HDFS-12872. EC Checksum broken when BlockAccessToken is enabled.
[Dec 6, 2017 9:52:41 AM] (wwei) YARN-7610.
Extend Distributed Shell to support launching job with -1 overall

The following subsystems voted -1:
    asflicense findbugs unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

FindBugs:
    module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api
    org.apache.hadoop.yarn.api.records.Resource.getResources() may expose internal representation by returning Resource.resources At Resource.java:[line 234]

Failed junit tests:
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170
    hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer
    hadoop.hdfs.TestFileChecksum
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
    hadoop.hdfs.web.TestWebHdfsTimeouts
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure190
    hadoop.fs.TestUnbuffer
    hadoop.hdfs.server.balancer.TestBalancerRPCDelay
    hadoop.hdfs.TestErasureCodingPolicies
    hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
    hadoop.hdfs.server.namenode.TestDecommissioningStatus
    hadoop.hdfs.TestReconstructStripedFile
    hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140
    hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
    hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
    hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart
    hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator
    hadoop.mapreduce.v2.TestUberAM

cc:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/diff-compile-cc-root.txt [4.0K]

javac:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/diff-compile-javac-root.txt [280K]

checkstyle:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/diff-checkstyle-root.txt [17M]

pylint:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/diff-patch-pylint.txt [20K]

shellcheck:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/diff-patch-shellcheck.txt [20K]

shelldocs:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/diff-patch-shelldocs.txt [12K]

whitespace:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/whitespace-eol.txt [8.8M]
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/whitespace-tabs.txt [288K]

findbugs:
    https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/614/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-
[jira] [Created] (HADOOP-15096) start-build-env.sh can create a docker image that fills up disk
Addison Higham created HADOOP-15096:
---------------------------------------

             Summary: start-build-env.sh can create a docker image that fills up disk
                 Key: HADOOP-15096
                 URL: https://issues.apache.org/jira/browse/HADOOP-15096
             Project: Hadoop Common
          Issue Type: Bug
          Components: build
    Affects Versions: 3.1.0
            Reporter: Addison Higham

start-build-env.sh has the potential to build an image that can fill up root disks by exploding a sparse file. In my case, the right ingredients are:

- Ubuntu 17.04
- Docker 17.09.0
- AUFS storage driver
- a user ID and group ID with a high number

This happens when building the hadoop-build-${USER_ID} image, specifically in the

{code}
RUN useradd -g ${GROUP_ID} -u ${USER_ID} -k /root -m ${USER_NAME}
{code}

command. The reason for this: /var/log/lastlog is a sparse file that pre-reserves space based on the highest seen UID and GID; in my case, those numbers are very high (above 1 billion). Locally, this results in a sparse file that reports as 443 GB. However, under Docker, and specifically AUFS, it appears that this file *isn't* sparse, and it tries to allocate the whole file. If you start this script and walk away to wait for it to finish, you come back to a computer with a completely full disk. Luckily, the fix is quite easy: simply add the `-l` option to useradd, which won't create those files.
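A sketch of the proposed fix in Dockerfile form (the surrounding build arguments are assumed to match what start-build-env.sh already passes in; the only change is the added flag):

{code}
# -l / --no-log-init tells useradd not to add the user to /var/log/lastlog
# and /var/log/faillog, so a very high UID no longer forces the sparse file
# to be materialized at full size under the AUFS storage driver.
RUN useradd -l -g ${GROUP_ID} -u ${USER_ID} -k /root -m ${USER_NAME}
{code}

The lastlog entries are meaningless inside a throwaway build container anyway, so skipping them loses nothing.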
[jira] [Reopened] (HADOOP-15012) Add readahead, dropbehind, and unbuffer to StreamCapabilities
[ https://issues.apache.org/jira/browse/HADOOP-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen reopened HADOOP-15012:
--------------------------------

> Add readahead, dropbehind, and unbuffer to StreamCapabilities
> -
>
> Key: HADOOP-15012
> URL: https://issues.apache.org/jira/browse/HADOOP-15012
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 2.9.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Fix For: 3.1.0
>
> Attachments: HADOOP-15012.branch-2.01.patch
>
> A split from HADOOP-14872 to track changes that enhance StreamCapabilities
> class with READAHEAD, DROPBEHIND, and UNBUFFER capability.
> Discussions and code reviews are done in HADOOP-14872.