[jira] [Created] (HDFS-16301) Improve BenchmarkThroughput#SIZE naming standardization

2021-11-05 Thread JiangHua Zhu (Jira)
JiangHua Zhu created HDFS-16301:
---

 Summary: Improve BenchmarkThroughput#SIZE naming standardization
 Key: HDFS-16301
 URL: https://issues.apache.org/jira/browse/HDFS-16301
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: benchmarks, test
Affects Versions: 2.9.2
Reporter: JiangHua Zhu


In the BenchmarkThroughput#run() method, the local variable SIZE is written in 
upper case, which Java conventions reserve for constants. Since it is an 
ordinary local variable, renaming it to lower camelCase (e.g. size) would be 
more appropriate.
public int run(String[] args) throws IOException {
  ...
  long SIZE = conf.getLong("dfsthroughput.file.size",
      10L * 1024 * 1024 * 1024);
  ...
}
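A minimal sketch of the proposed rename, assuming the surrounding code stays as 
quoted above. The map-backed lookup below stands in for Hadoop's Configuration 
so the example is self-contained; the class and helper names are illustrative, 
not the actual BenchmarkThroughput code.

```java
import java.util.HashMap;
import java.util.Map;

public class SizeNamingSketch {
    // Stand-in for Hadoop's Configuration#getLong; illustrative only.
    private final Map<String, String> conf = new HashMap<>();

    long getLong(String key, long defaultValue) {
        String v = conf.get(key);
        return v == null ? defaultValue : Long.parseLong(v);
    }

    public int run(String[] args) {
        // Before: long SIZE = ...  (upper case wrongly suggests a constant)
        // After: lower camelCase, matching Java local-variable conventions.
        long size = getLong("dfsthroughput.file.size",
            10L * 1024 * 1024 * 1024);
        System.out.println(size); // prints the 10 GiB default
        return 0;
    }

    public static void main(String[] args) {
        System.exit(new SizeNamingSketch().run(args));
    }
}
```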



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2021-11-05 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/

[Nov 4, 2021 7:57:35 AM] (Takanobu Asanuma) HDFS-16294. Remove invalid 
DataNode#CONFIG_PROPERTY_SIMULATED. (#3605)




-1 overall


The following subsystems voted -1:
asflicense hadolint mvnsite pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.fs.TestFileUtil 
   hadoop.hdfs.server.datanode.TestDataNodeUUID 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   
hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat 
   hadoop.hdfs.server.federation.router.TestRouterQuota 
   hadoop.hdfs.server.federation.resolver.order.TestLocalResolver 
   hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver 
   
hadoop.yarn.server.resourcemanager.monitor.invariants.TestMetricsInvariantChecker
 
   hadoop.yarn.server.resourcemanager.TestClientRMService 
   hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter 
   hadoop.mapreduce.lib.input.TestLineRecordReader 
   hadoop.mapred.TestLineRecordReader 
   hadoop.tools.TestDistCpSystem 
   hadoop.yarn.sls.TestSLSRunner 
   hadoop.resourceestimator.service.TestResourceEstimatorService 
   hadoop.resourceestimator.solver.impl.TestLpSolver 
  

   cc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/diff-compile-javac-root.txt
  [496K]

   checkstyle:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/diff-checkstyle-root.txt
  [14M]

   hadolint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   mvnsite:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-mvnsite-root.txt
  [584K]

   pathlen:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/diff-patch-pylint.txt
  [48K]

   shellcheck:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/diff-patch-shellcheck.txt
  [56K]

   shelldocs:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/diff-patch-shelldocs.txt
  [48K]

   whitespace:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/whitespace-eol.txt
  [12M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/whitespace-tabs.txt
  [1.3M]

   javadoc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-javadoc-root.txt
  [32K]

   unit:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [232K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [428K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [12K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
  [40K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
  [20K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
  [128K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt
  [104K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/472/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
  [104K]
   
https://ci-hadoop.apach

[jira] [Created] (HDFS-16302) RBF: RouterRpcFairnessPolicyController record requests handled by each nameservice

2021-11-05 Thread Janus Chow (Jira)
Janus Chow created HDFS-16302:
-

 Summary: RBF: RouterRpcFairnessPolicyController record requests 
handled by each nameservice
 Key: HDFS-16302
 URL: https://issues.apache.org/jira/browse/HDFS-16302
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Janus Chow
Assignee: Janus Chow


In HDFS-16296, we added metrics to record the rejected permits for each 
namespace; it would also be valuable to record the requests handled for each 
namespace.

This ticket adds that per-namespace handled-request metric.
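A hedged sketch of what such per-nameservice accounting might look like. The 
class and method names below are hypothetical, not the actual 
RouterRpcFairnessPolicyController API; the point is only the shape of the 
bookkeeping (a concurrent per-nameservice counter).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical per-nameservice counters; not the actual Router code.
public class NameserviceRequestMetrics {
    private final Map<String, LongAdder> handled = new ConcurrentHashMap<>();

    // Called after a request to the given nameservice completes.
    public void recordHandled(String nsId) {
        handled.computeIfAbsent(nsId, k -> new LongAdder()).increment();
    }

    public long getHandled(String nsId) {
        LongAdder a = handled.get(nsId);
        return a == null ? 0 : a.sum();
    }

    public static void main(String[] args) {
        NameserviceRequestMetrics m = new NameserviceRequestMetrics();
        m.recordHandled("ns0");
        m.recordHandled("ns0");
        m.recordHandled("ns1");
        System.out.println(m.getHandled("ns0") + " " + m.getHandled("ns1"));
    }
}
```

LongAdder is used rather than AtomicLong because these counters are 
increment-heavy and read rarely, which is the workload LongAdder is designed for.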






[jira] [Created] (HDFS-16303) Losing over 100 datanodes in state decommissioning results in full blockage of all datanode decommissioning

2021-11-05 Thread Kevin Wikant (Jira)
Kevin Wikant created HDFS-16303:
---

 Summary: Losing over 100 datanodes in state decommissioning 
results in full blockage of all datanode decommissioning
 Key: HDFS-16303
 URL: https://issues.apache.org/jira/browse/HDFS-16303
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.3.1, 2.10.1
Reporter: Kevin Wikant


## Problem Description

The HDFS Namenode class "DatanodeAdminManager" is responsible for 
decommissioning datanodes.

As per this "hdfs-site" configuration:
{quote}Config = dfs.namenode.decommission.max.concurrent.tracked.nodes 
Default Value = 100

The maximum number of decommission-in-progress datanodes that will be tracked 
at one time by the namenode. Tracking a decommission-in-progress datanode 
consumes additional NN memory proportional to the number of blocks on the 
datanode. Having a conservative limit reduces the potential impact of 
decommissioning a large number of nodes at once. A value of 0 means no limit 
will be enforced.
{quote}
The Namenode will only actively track up to 100 datanodes for decommissioning 
at any given time, so as to avoid Namenode memory pressure.

Looking into the "DatanodeAdminManager" code:
 * a datanode is removed from the "tracked.nodes" set only when it finishes 
decommissioning
 * a datanode is added to the "tracked.nodes" set only if fewer than 100 
datanodes are being tracked

So when more than 100 datanodes are being decommissioned at a given time, some 
of those datanodes will not be in the "tracked.nodes" set until one or more 
datanodes in "tracked.nodes" finish decommissioning. This is generally not a 
problem because the datanodes in "tracked.nodes" will eventually finish 
decommissioning, but there is an edge case where this logic prevents the 
namenode from making any forward progress towards decommissioning.

If all 100 datanodes in the "tracked.nodes" are unable to finish 
decommissioning, then other datanodes (which may be able to be decommissioned) 
will never get added to "tracked.nodes" and therefore will never get the 
opportunity to be decommissioned.

This can occur due to the following issue:
{quote}2021-10-21 12:39:24,048 WARN 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager 
(DatanodeAdminMonitor-0): Node W.X.Y.Z:50010 is dead while in Decommission In 
Progress. Cannot be safely decommissioned or be in maintenance since there is 
risk of reduced data durability or data loss. Either restart the failed node or 
force decommissioning or maintenance by removing, calling refreshNodes, then 
re-adding to the excludes or host config files.
{quote}
If a Datanode is lost while decommissioning (for example, if the underlying 
hardware fails), then it will remain in the decommissioning state forever.

If 100 or more Datanodes are lost while decommissioning over the Hadoop cluster 
lifetime, then this is enough to completely fill up the "tracked.nodes" set. 
With the entire "tracked.nodes" set filled with datanodes that can never finish 
decommissioning, any datanodes added after this point will never be able to be 
decommissioned because they will never be added to the "tracked.nodes" set.

In this scenario:
 * the "tracked.nodes" set is filled with datanodes which are lost & cannot be 
recovered (and can never finish decommissioning so they will never be removed 
from the set)
 * the actual live datanodes being decommissioned are enqueued waiting to enter 
the "tracked.nodes" set (and are stuck waiting indefinitely)

This means that no progress towards decommissioning the live datanodes will be 
made unless the user takes the following action:
{quote}Either restart the failed node or force decommissioning or maintenance 
by removing, calling refreshNodes, then re-adding to the excludes or host 
config files.
{quote}
Ideally, the Namenode should be able to gracefully handle scenarios where the 
datanodes in the "tracked.nodes" set are not making forward progress towards 
decommissioning while the enqueued datanodes may be able to make forward 
progress.
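The starvation described above can be sketched as a small simulation. The 
names below (TRACKED_LIMIT, the node strings, the admit loop) are illustrative 
stand-ins for the DatanodeAdminManager logic, not the actual implementation; 
the point is only that once the tracked set is full of nodes that never 
finish, the pending queue is never drained.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical simulation of the tracked-set starvation; not Namenode code.
public class TrackedSetStarvation {
    // dfs.namenode.decommission.max.concurrent.tracked.nodes default
    static final int TRACKED_LIMIT = 100;

    public static void main(String[] args) {
        Set<String> tracked = new HashSet<>();
        Queue<String> pending = new ArrayDeque<>();

        // 100 datanodes die while decommissioning: they enter the tracked
        // set and, being unrecoverable, are never removed from it.
        for (int i = 0; i < TRACKED_LIMIT; i++) {
            tracked.add("dead-node-" + i);
        }
        // A healthy node now asks to decommission and is enqueued.
        pending.add("live-node-0");

        // One monitor tick: admit pending nodes only while under the limit.
        while (!pending.isEmpty() && tracked.size() < TRACKED_LIMIT) {
            tracked.add(pending.poll());
        }
        // The dead nodes never finish, so tracked.size() stays pinned at the
        // limit and the live node waits in the queue indefinitely.
        System.out.println("live node stuck in queue: "
            + pending.contains("live-node-0"));
    }
}
```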

## Reproduction Steps
 * create a Hadoop cluster
 * lose (i.e. terminate the host/process forever) over 100 datanodes while the 
datanodes are in state decommissioning
 * add additional datanodes to the cluster
 * attempt to decommission those new datanodes & observe that they are stuck in 
state decommissioning forever

Note that in this example each datanode, over the full history of the cluster, 
has a unique IP address.






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2021-11-05 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/

[Nov 4, 2021 1:16:03 AM] (noreply) HDFS-16266. Add remote port information to 
HDFS audit log (#3538)
[Nov 4, 2021 1:51:11 AM] (noreply) HDFS-16291. Make the comment of 
INode#ReclaimContext more standardized. (#3602)
[Nov 4, 2021 4:47:41 AM] (noreply) HADOOP-17374. support listObjectV2 (#3587)
[Nov 4, 2021 7:43:25 AM] (noreply) HDFS-16294. Remove invalid 
DataNode#CONFIG_PROPERTY_SIMULATED. (#3605)
[Nov 4, 2021 2:40:37 PM] (noreply) HADOOP-17873. ABFS: Fix transient failures 
in ITestAbfsStreamStatistics and ITestAbfsRestOperationException (#3341)
[Nov 4, 2021 4:19:11 PM] (noreply) HDFS-16300. Use libcrypto in Windows for 
libhdfspp (#3617)




-1 overall


The following subsystems voted -1:
blanks compile golang mvninstall mvnsite pathlen spotbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

Failed junit tests :

   hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes 
   hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand 
   hadoop.hdfs.TestHDFSFileSystemContract 
   hadoop.fs.http.client.TestHttpFSFWithSWebhdfsFileSystem 
   hadoop.fs.http.client.TestHttpFSFWithWebhdfsFileSystem 
   hadoop.mapreduce.v2.TestUberAM 
   hadoop.mapreduce.v2.TestMRJobsWithProfiler 
   hadoop.tools.util.TestDistCpUtils 
   hadoop.tools.mapred.lib.TestDynamicInputFormat 
   hadoop.yarn.csi.client.TestCsiClient 
   hadoop.tools.dynamometer.TestDynamometerInfra 
  

   mvninstall:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/patch-mvninstall-root.txt
 [1.7M]

   compile:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/patch-compile-root.txt
 [532K]

   cc:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/patch-compile-root.txt
 [532K]

   golang:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/patch-compile-root.txt
 [532K]

   javac:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/patch-compile-root.txt
 [532K]

   blanks:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/blanks-eol.txt
 [13M]
  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/blanks-tabs.txt
 [2.0M]

   checkstyle:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/results-checkstyle-root.txt
 [14M]

   mvnsite:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/patch-mvnsite-root.txt
 [508K]

   pathlen:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/results-pathlen.txt
 [16K]

   pylint:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/results-pylint.txt
 [20K]

   shellcheck:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/results-shellcheck.txt
 [28K]

   xml:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/xml.txt
 [24K]

   javadoc:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/patch-javadoc-root.txt
 [3.8M]

   spotbugs:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/branch-spotbugs-hadoop-tools_hadoop-azure.txt
 [4.0K]
  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/branch-spotbugs-hadoop-tools.txt
 [136K]
  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/679/artifact/out/branch-spotbugs-root.