[jira] [Reopened] (HDFS-14591) NameNode should move the replicas to the correct storages after the storage policy is changed.
[ https://issues.apache.org/jira/browse/HDFS-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena reopened HDFS-14591:
---------------------------------

Reopened, and closing again as a duplicate rather than as fixed.

> NameNode should move the replicas to the correct storages after the storage
> policy is changed.
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-14591
>                 URL: https://issues.apache.org/jira/browse/HDFS-14591
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>
> Our Xiaomi HDFS has a cluster storing both HOT and COLD data. We have a
> background process that searches all the files to find those that have not
> been accessed for a period of time. We then set them to COLD and start a
> mover to move the replicas. After the move, all the replicas are consistent
> with the storage policy.
> It's a natural idea to let the NameNode handle the move.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-14591) NameNode should move the replicas to the correct storages after the storage policy is changed.
[ https://issues.apache.org/jira/browse/HDFS-14591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena resolved HDFS-14591.
---------------------------------
    Resolution: Duplicate

> NameNode should move the replicas to the correct storages after the storage
> policy is changed.
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-14591
>                 URL: https://issues.apache.org/jira/browse/HDFS-14591
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>
> Our Xiaomi HDFS has a cluster storing both HOT and COLD data. We have a
> background process that searches all the files to find those that have not
> been accessed for a period of time. We then set them to COLD and start a
> mover to move the replicas. After the move, all the replicas are consistent
> with the storage policy.
> It's a natural idea to let the NameNode handle the move.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
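For context, the workflow described in the issue maps onto public HDFS APIs:
setting a storage policy is a metadata-only change on the NameNode, and the
replicas migrate only once the Mover tool runs. A minimal sketch in Java
(SetColdPolicy and /data/cold are hypothetical, not the actual Xiaomi tooling):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SetColdPolicy {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Tag a cold dataset with the COLD storage policy (ARCHIVE storage).
        // This only updates NameNode metadata; the existing replicas stay on
        // their current storages, which is the gap HDFS-14591 describes.
        Path coldDir = new Path("/data/cold");  // hypothetical path
        fs.setStoragePolicy(coldDir, "COLD");

        // Replica migration is then triggered by a separate tool, e.g.:
        //   hdfs mover -p /data/cold
        fs.close();
      }
    }

The improvement proposed in the issue would have the NameNode schedule this
migration itself after the policy change, removing the separate Mover step.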
Re: HDFS Performance Improvement JIRAs: reviewers wanted
Thanks Wei-Chiu for your great work. All of the JIRAs listed are very
valuable, and I will do my best to participate in the reviews and give
feedback. On another note, I think there are also some helpful JIRAs that
have not been dug up yet. Does the spreadsheet support adding more candidate
JIRAs about performance? (to Wei-Chiu)

Some other discussion:
a. I suggest we go through all JIRAs regularly and report performance
improvement JIRAs. Of course this takes a lot of time, but I believe many
contributors would like to participate. It may also be a good topic for the
community sync-up (cc @Wangda).
b. Beyond that, I think we should also scan bug JIRAs (for instance
HDFS-12862) that were reported but have not been fixed to date.

Thanks Wei-Chiu again.

Best Regards,
Hexiaoqiao

On Sat, Jun 22, 2019 at 11:47 AM Wei-Chiu Chuang wrote:

> I spent the past week going through most of the JIRAs with a patch
> attached in the past, and turned up some really good stuff that helps
> improve HDFS performance.
>
> The list of JIRAs is in the following spreadsheet. If you are interested
> in reviewing them, please update the spreadsheet and add yourself as a
> reviewer. A reviewer does not need to be a Hadoop committer, but it helps
> to give the author feedback.
>
> https://docs.google.com/spreadsheets/d/1dvLoZ039ZirdZF9p0wWKhFCtD91jfbdkPg4XZ-AnMNg/edit?usp=sharing
>
> I am doing this exercise to identify known performance limitations plus
> fixes that were submitted but never committed. There are cases where a
> patch was reviewed or even blessed with a +1 but never pushed to the
> repo; there are cases where good ideas never got reviewed.
>
> I think this is low-hanging fruit that we as a community should pick.
>
> I use this filter to search for Hadoop/HDFS patches, if you are
> interested:
>
> https://issues.apache.org/jira/issues/?filter=12311124&jql=project%20in%20(HADOOP%2C%20HDFS)%20AND%20status%20%3D%20%22Patch%20Available%22%20ORDER%20BY%20updated%20DESC%2C%20key%20DESC
>
> Best,
> Wei-Chiu
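For readability, the JQL query encoded in the filter URL above decodes to:

    project in (HADOOP, HDFS) AND status = "Patch Available"
    ORDER BY updated DESC, key DESC

Pasting that into JIRA's advanced-search (JQL) box gives the same result set
as the filter link.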
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/

[Jun 21, 2019 2:36:44 AM] (weichiu) HDFS-14465. When the Block expected replications is larger than the
[Jun 21, 2019 4:06:06 AM] (weichiu) HDFS-14303. check block directory logic not correct when there is only

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    XML :

        Parsing Error(s):
            hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
            hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
            hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
            hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    FindBugs :

        module:hadoop-common-project/hadoop-common
            Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient non-serializable instance field map In GlobalStorageStatistics.java:instance field map In GlobalStorageStatistics.java

    FindBugs :

        module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
            Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335]

    Failed junit tests :
        hadoop.hdfs.server.datanode.TestRefreshNamenodes
        hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
        hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.hdfs.server.datanode.TestDirectoryScanner
        hadoop.yarn.client.api.impl.TestAMRMProxy
        hadoop.registry.secure.TestSecureLogins
        hadoop.yarn.server.resourcemanager.TestRMRestart
        hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2

    cc:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt  [4.0K]

    javac:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt  [328K]

    cc:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-compile-cc-root-jdk1.8.0_212.txt  [4.0K]

    javac:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-compile-javac-root-jdk1.8.0_212.txt  [308K]

    checkstyle:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-checkstyle-root.txt  [16M]

    hadolint:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-patch-hadolint.txt  [4.0K]

    pathlen:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/pathlen.txt  [12K]

    pylint:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-patch-pylint.txt  [24K]

    shellcheck:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-patch-shellcheck.txt  [72K]

    shelldocs:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-patch-shelldocs.txt  [8.0K]

    whitespace:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/whitespace-eol.txt  [12M]
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/whitespace-tabs.txt  [1.2M]

    xml:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/xml.txt  [12K]

    findbugs:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html  [8.0K]
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html  [8.0K]

    javadoc:
        https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/360/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt  [16K]
        https://builds.apache.org/job/hadoop-qbt-branch2-java
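A side note on the two FindBugs findings above, since they are generic bug
patterns rather than anything specific to this build. A minimal standalone
illustration in Java (Stats and clampToZero are hypothetical names, not the
actual GlobalStorageStatistics or ColumnRWHelper code):

    import java.io.Serializable;
    import java.util.HashMap;
    import java.util.Map;

    public class FindbugsPatterns {

      // SE_BAD_FIELD: a Serializable class holding a non-transient field
      // whose declared type (Map) is not itself Serializable, so default
      // serialization may fail at runtime. Fix: mark the field transient,
      // or declare it with a serializable type such as HashMap.
      static class Stats implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Map<String, Long> map = new HashMap<>();  // flagged

        void record(String key, long value) {
          map.put(key, value);
        }
      }

      // BX_UNBOXING_IMMEDIATELY_REBOXED: mixing Integer and int in a ternary
      // unboxes v to int and then reboxes the result to Integer on return,
      // an avoidable allocation. Fix: keep both branches boxed, e.g. use
      // Integer.valueOf(0) instead of the int literal 0.
      static Integer clampToZero(Integer v) {
        return (v != null && v > 0) ? v : 0;  // flagged
      }
    }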
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/

[Jun 21, 2019 2:57:48 AM] (weichiu) HADOOP-14385. HttpExceptionUtils#validateResponse swallows exceptions.
[Jun 21, 2019 3:17:24 AM] (weichiu) HDFS-12564. Add the documents of swebhdfs configurations on the client
[Jun 21, 2019 3:23:05 AM] (weichiu) HDFS-13893. DiskBalancer: no validations for Disk balancer commands.
[Jun 21, 2019 9:35:43 AM] (elek) HDDS-1674. Make ScmBlockLocationProtocol message type based.
[Jun 21, 2019 9:41:05 AM] (wwei) YARN-9209. When nodePartition is not set in Placement Constraints,
[Jun 21, 2019 12:14:06 PM] (elek) HDDS-1678. Default image name for kubernetes examples should be ozone
[Jun 21, 2019 2:25:10 PM] (elek) HDDS-1715. Update the Intellij runner definitition of SCM to use the new
[Jun 21, 2019 5:23:11 PM] (bharat) HDDS-1690. ContainerController should provide a way to retrieve
[Jun 22, 2019 12:05:13 AM] (weichiu) HADOOP-15989. Synchronized at CompositeService#removeService is not
[Jun 22, 2019 1:17:36 AM] (weichiu) HDFS-12487. FsDatasetSpi.isValidBlock() lacks null pointer check inside
[Jun 22, 2019 1:27:03 AM] (weichiu) HDFS-14074. DataNode runs async disk checks maybe throws

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    FindBugs :

        module:hadoop-hdfs-project/hadoop-hdfs
            Redundant nullcheck of block, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.DiskBalancer$DiskBalancerMover.getBlockToCopy(FsVolumeSpi$BlockIterator, DiskBalancerWorkItem) Redundant null check at DiskBalancer.java:is known to be non-null in org.apache.hadoop.hdfs.server.datanode.DiskBalancer$DiskBalancerMover.getBlockToCopy(FsVolumeSpi$BlockIterator, DiskBalancerWorkItem) Redundant null check at DiskBalancer.java:[line 914]

    FindBugs :

        module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
            Unread field:TimelineEventSubDoc.java:[line 56]
            Unread field:TimelineMetricSubDoc.java:[line 44]

    FindBugs :

        module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
            Class org.apache.hadoop.applications.mawo.server.common.TaskStatus implements Cloneable but does not define or use clone method At TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 39-346]
            Equals method for org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument is of type WorkerId At WorkerId.java:the argument is of type WorkerId At WorkerId.java:[line 114]
            org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does not check for null argument At WorkerId.java:null argument At WorkerId.java:[lines 114-115]

    Failed junit tests :
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.hdfs.server.diskbalancer.TestDiskBalancer
        hadoop.hdfs.server.datanode.TestDirectoryScanner
        hadoop.mapreduce.v2.app.TestRuntimeEstimators
        hadoop.hdds.scm.pipeline.TestSCMPipelineManager
        hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion
        hadoop.ozone.client.rpc.TestOzoneRpcClient
        hadoop.hdds.scm.pipeline.TestRatisPipelineProvider
        hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis
        hadoop.ozone.client.rpc.TestSecureOzoneRpcClient
        hadoop.ozone.client.rpc.TestFailureHandlingByClient
        hadoop.ozone.client.rpc.TestOzoneAtRestEncryption

    cc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/diff-compile-cc-root.txt  [4.0K]

    javac:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/diff-compile-javac-root.txt  [332K]

    checkstyle:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/diff-checkstyle-root.txt  [17M]

    hadolint:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/diff-patch-hadolint.txt  [8.0K]

    pathlen:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/pathlen.txt  [12K]

    pylint:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/diff-patch-pylint.txt  [120K]

    shellcheck:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/diff-patch-shellcheck.txt  [20K]

    shelldocs:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1175/artifact/out/diff-patch-shelldocs.txt  [44K]

    whitespace:
        https://builds.
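A side note on the WorkerId findings above: they are the classic equals()
contract issues. A minimal sketch of a null-safe, type-checked
equals()/hashCode() pair in Java (WorkerIdExample is a hypothetical class,
not the actual mawo code):

    import java.util.Objects;

    public class WorkerIdExample {
      private final String hostname;

      WorkerIdExample(String hostname) {
        this.hostname = hostname;
      }

      // Addresses both findings: instanceof is false for a null argument,
      // and the cast only happens after the runtime type has been verified,
      // so equals() no longer assumes the argument is a WorkerIdExample.
      @Override
      public boolean equals(Object o) {
        if (this == o) {
          return true;
        }
        if (!(o instanceof WorkerIdExample)) {
          return false;
        }
        WorkerIdExample other = (WorkerIdExample) o;
        return Objects.equals(hostname, other.hostname);
      }

      // equals() and hashCode() must stay consistent with each other.
      @Override
      public int hashCode() {
        return Objects.hashCode(hostname);
      }
    }

The TaskStatus and DiskBalancer findings are similar housekeeping: a class
declaring Cloneable should either override clone() or drop the interface, and
a null check on a value the analyzer has already proven non-null can simply
be removed.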