Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/ No changes -1 overall The following subsystems voted -1: asflicense hadolint mvnsite pathlen unit The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck whitespace The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: Failed junit tests : hadoop.fs.TestFileUtil hadoop.hdfs.server.datanode.TestDirectoryScanner hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys hadoop.hdfs.server.namenode.TestFSImageWithAcl hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat hadoop.hdfs.server.federation.router.TestRouterQuota hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver hadoop.hdfs.server.federation.resolver.order.TestLocalResolver hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestNumaResourceAllocator hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestNumaResourceHandlerImpl hadoop.yarn.server.resourcemanager.TestClientRMService hadoop.yarn.server.resourcemanager.monitor.invariants.TestMetricsInvariantChecker hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter hadoop.mapreduce.lib.input.TestLineRecordReader hadoop.mapred.TestLineRecordReader hadoop.tools.TestDistCpSystem hadoop.yarn.sls.TestSLSRunner hadoop.resourceestimator.solver.impl.TestLpSolver hadoop.resourceestimator.service.TestResourceEstimatorService cc: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/diff-compile-cc-root.txt [4.0K] javac: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/diff-compile-javac-root.txt [488K] checkstyle: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/diff-checkstyle-root.txt [14M] hadolint: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/diff-patch-hadolint.txt [4.0K] mvnsite: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-mvnsite-root.txt [568K] pathlen: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/pathlen.txt [12K] pylint: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/diff-patch-pylint.txt [20K] shellcheck: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/diff-patch-shellcheck.txt [72K] whitespace: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/whitespace-eol.txt [12M] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/whitespace-tabs.txt [1.3M] javadoc: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-javadoc-root.txt [40K] unit: https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [220K] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [444K] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt [16K] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt [36K] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt [20K] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [72K] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt [116K] https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/804/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt [104K] https://ci-hadoop.apache.or
[jira] [Created] (HDFS-16792) Add -newEditsOnly option to name node initializeSharedEdits command to format unformatted journal nodes for master replacement
Ashutosh Gupta created HDFS-16792: - Summary: Add -newEditsOnly option to name node initializeSharedEdits command to format unformatted journal nodes for master replacement Key: HDFS-16792 URL: https://issues.apache.org/jira/browse/HDFS-16792 Project: Hadoop HDFS Issue Type: New Feature Components: journal-node Affects Versions: 3.3.4, 3.3.3 Reporter: Ashutosh Gupta Assignee: Ashutosh Gupta Add -newEditsOnly option to name node initializeSharedEdits command to format unformatted journal nodes for master replacement. The current hadoop has limitations on formatting journal nodes. First, it requires all journal nodes to be present when formatting name node. After that, new journal nodes could not be formatted unless we reformat all of them and destroy all HDFS edit logs. $ hdfs namenode -initializeSharedEdits -newEditsOnly There are two cases here: 1) the replaced master instance has name node running on it, then the above command will only be run once on the existing name node. 2) the replaced master instance does not have a name node running on it, then the above command will be run twice. But the above command is idempotent as long as the same command will not be issued at the same time, which could be guaranteed by instance controller when reconfiguring master instances. The reason is that all hadoop name node commands such as format, bootstrap, and initializeSharedEdits are not designed to be thread safe and they rely on an external mechanism, which is a manual work for hadoop admin for on-premise clusters, to make sure that they are not executed concurrently. We still keep the same behavior of the initializeSharedEdits command to minimize the code/logic changes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/ No changes -1 overall The following subsystems voted -1: blanks pathlen xml The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: XML : Parsing Error(s): hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml cc: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/results-compile-cc-root.txt [96K] javac: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/results-compile-javac-root.txt [532K] blanks: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/blanks-eol.txt [14M] https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/blanks-tabs.txt [2.0M] checkstyle: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/results-checkstyle-root.txt [14M] pathlen: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/results-pathlen.txt [16K] pylint: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/results-pylint.txt [20K] shellcheck: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/results-shellcheck.txt [28K] xml: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/xml.txt [24K] javadoc: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1003/artifact/out/results-javadoc-javadoc-root.txt [400K] Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-16793) ObserverNameNode fails to select streaming inputStream with a timeout exception
ZanderXu created HDFS-16793: --- Summary: ObserverNameNode fails to select streaming inputStream with a timeout exception Key: HDFS-16793 URL: https://issues.apache.org/jira/browse/HDFS-16793 Project: Hadoop HDFS Issue Type: Improvement Reporter: ZanderXu Assignee: ZanderXu In out prod environment, we encountered one case that observer namenode failed to select streaming inputStream with a timeout exception. And the related code as bellow: {code:java} @Override public void selectInputStreams(Collection estreams, long fromTxnId, boolean inProgressOk, boolean onlyDurableTxns) throws IOException { if (inProgressOk && inProgressTailingEnabled) { ... } // Timeout here. selectStreamingInputStreams(streams, fromTxnId, inProgressOk, onlyDurableTxns); } {code} After looked into the code and found that JournalNode contains one very expensive and redundant operation that scan all of edits of the last in-progress segment with IO. The related code as bellow: {code:java} public List getRemoteEditLogs(long firstTxId, boolean inProgressOk) throws IOException { File currentDir = sd.getCurrentDir(); List allLogFiles = matchEditLogs(currentDir); List ret = Lists.newArrayListWithCapacity( allLogFiles.size()); for (EditLogFile elf : allLogFiles) { if (elf.hasCorruptHeader() || (!inProgressOk && elf.isInProgress())) { continue; } // Here. if (elf.isInProgress()) { try { elf.scanLog(getLastReadableTxId(), true); } catch (IOException e) { LOG.error("got IOException while trying to validate header of " + elf + ". Skipping.", e); continue; } } if (elf.getFirstTxId() >= firstTxId) { ret.add(new RemoteEditLog(elf.firstTxId, elf.lastTxId, elf.isInProgress())); } else if (elf.getFirstTxId() < firstTxId && firstTxId <= elf.getLastTxId()) { // If the firstTxId is in the middle of an edit log segment. Return this // anyway and let the caller figure out whether it wants to use it. ret.add(new RemoteEditLog(elf.firstTxId, elf.lastTxId, elf.isInProgress())); } } Collections.sort(ret); return ret; } {code} Expensive: * This scan operation will scan all of the edits of the in-progress segment with IO. Redundant: * This scan operation just find the lastTxId of this in-progress segment * But the caller method getEditLogManifest(long sinceTxId, boolean inProgressOk) in Journal.java just ignore the lastTxId of the in-progress segment and use getHighestWrittenTxId() as the lastTxId of the in-progress and return to namenode. * So, the scan operation is redundant. If end user enable the Observer Read feature, the delay of the tailing edits from journalnode is very important, whether it is normal process or fallback process. And there is no more comments about this scan logic after looked into the code and HDFS-6634 which added this logic. The only effect I can get is to scan the in-progress segment for corruption. But namenode can handle the corrupted in-progress segment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org