Re: [VOTE] Hadoop 3.1.x EOL

2021-06-10 Thread Akira Ajisaka
This vote has passed with 18 binding +1 votes. I'll update the JIRA and the wiki.

Thanks all for your participation.

On Tue, Jun 8, 2021 at 3:03 AM Steve Loughran  wrote:
>
>
>
> On Thu, 3 Jun 2021 at 07:14, Akira Ajisaka  wrote:
>>
>> Dear Hadoop developers,
>>
>> Given the feedback from the discussion thread [1], I'd like to start an
>> official vote thread for the community to vote and start the 3.1 EOL process.
>>
>> What this entails:
>>
>> (1) an official announcement that no further regular Hadoop 3.1.x releases
>> will be made after 3.1.4.
>> (2) resolve JIRAs that specifically target 3.1.5 as won't fix.
>>
>> This vote will run for 7 days and conclude by June 10th, 16:00 JST [2].
>>
>> Committers are eligible to cast binding votes. Non-committers are welcome
>> to cast non-binding votes.
>>
>> Here is my vote, +1
>
>
>
> +1 (binding)
>>
>>

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15272) Backport HDFS-12862 to branch-3.1

2021-06-10 Thread Akira Ajisaka (Jira)


 [ https://issues.apache.org/jira/browse/HDFS-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka resolved HDFS-15272.
--
Resolution: Won't Fix

branch-3.1 is EoL. Closing as won't fix.

> Backport HDFS-12862 to branch-3.1
> -
>
> Key: HDFS-15272
> URL: https://issues.apache.org/jira/browse/HDFS-15272
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.1.4
> Reporter: Xiaoqiao He
> Assignee: Xiaoqiao He
> Priority: Major
> Attachments: HDFS-15272.branch-3.1.001.patch
>
>
> Backport HDFS-12862 ("CacheDirective becomes invalid when NN restart or
> failover") to branch-3.1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Reopened] (HDFS-15272) Backport HDFS-12862 to branch-3.1

2021-06-10 Thread Akira Ajisaka (Jira)


 [ https://issues.apache.org/jira/browse/HDFS-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka reopened HDFS-15272:
--







[jira] [Reopened] (HDFS-15653) dfshealth.html#tab-overview is not working

2021-06-10 Thread Akira Ajisaka (Jira)


 [ https://issues.apache.org/jira/browse/HDFS-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka reopened HDFS-15653:
--

> dfshealth.html#tab-overview is not working
> --
>
> Key: HDFS-15653
> URL: https://issues.apache.org/jira/browse/HDFS-15653
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 3.3.0
> Environment: CentOS 7.4, HDFS 3.3.0
> Reporter: Yakir Gibraltar
> Priority: Major
> Labels: web-console, web-dashboard
> Attachments: image-2020-10-26-20-10-29-419.png,
> image-2020-10-26-20-10-35-947.png
>
>
> Hi, in version 3.3.0, the URL of 
> http://:/dfshealth.html#tab-overview is broken.
>  The error in "Developer tools":
> {code:java}
> dfs-dust.js:121 Uncaught TypeError: $.get(...).error is not a function
> at Object.<anonymous> (dfs-dust.js:121)
> at Function.each (jquery-3.4.1.min.js:2)
> at load_json (dfs-dust.js:111)
> at load_overview (dfshealth.js:99)
> at load_page (dfshealth.js:452)
> at dfshealth.js:459
> at dfshealth.js:464
> (anonymous) @ dfs-dust.js:121
> each @ jquery-3.4.1.min.js:2
> load_json @ dfs-dust.js:111
> load_overview @ dfshealth.js:99
> load_page @ dfshealth.js:452
> (anonymous) @ dfshealth.js:459
> (anonymous) @ dfshealth.js:464
> {code}
> !image-2020-10-26-20-10-35-947.png!
>  
> Thank you, Yakir Gibraltar.






[jira] [Resolved] (HDFS-15653) dfshealth.html#tab-overview is not working

2021-06-10 Thread Akira Ajisaka (Jira)


 [ https://issues.apache.org/jira/browse/HDFS-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka resolved HDFS-15653.
--
Resolution: Not A Problem







Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-10 Thread Viraj Jasani
+1 (non-binding)

* Signature: ok
* Checksum : ok
* Rat check (1.8.0_171): ok
 - mvn clean apache-rat:check
* Built from source (1.8.0_171): ok
 - mvn clean install  -DskipTests
* HDFS basic testing in pseudo-distributed mode: ok
* Built HBase 2.4.4 with Hadoop 3.3.1 RC and tested some basic scenarios,
looks good

On Wed, Jun 9, 2021 at 10:55 PM Stack  wrote:

> +1
>
> * Signature: ok
> * Checksum : ok
> * Rat check (1.8.0_191): ok
>  - mvn clean apache-rat:check
> * Built from source (1.8.0_191): ok
>  - mvn clean install -DskipTests
>
> Ran a ten node cluster w/ hbase on top running its verification loadings w/
> (gentle) chaos. Had trouble getting the rig running but mostly pilot error
> and none that I could particularly attribute to hdfs after poking in logs.
>
> Messed in UI and shell some. Nothing untoward.
>
> Wei-Chiu fixed broken tests over in hbase and complete runs are pretty much
> there (a classic flakie seems more-so on 3.3.1... will dig in more on why).
>
> Thanks,
> S
>
> On Tue, Jun 1, 2021 at 3:29 AM Wei-Chiu Chuang  wrote:
>
> > Hi community,
> >
> > This is the release candidate RC3 of the Apache Hadoop 3.3.1 line. All
> > blocker issues have been resolved [1] again.
> >
> > There are 2 additional issues resolved for RC3:
> > * Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
> > HADOOP-16878"
> > * Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source
> > and destination are the same"
> >
> > There are 4 issues resolved for RC2:
> > * HADOOP-17666. Update LICENSE for 3.3.1
> > * MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
> > * Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
> > * HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)
> >
> > The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra
> > fixes compared to hadoop-thirdparty 1.1.0:
> > * HADOOP-17707. Remove jaeger document from site index.
> > * HADOOP-17730. Add back error_prone
> >
> > *RC tag is release-3.3.1-RC3*
> > https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3
> >
> > *The RC3 artifacts are at*:
> > https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
> > ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/
> >
> > *The maven artifacts are hosted here:*
> > https://repository.apache.org/content/repositories/orgapachehadoop-1320/
> >
> > *My public key is available here:*
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> >
> > Things I've verified:
> > * all blocker issues targeting 3.3.1 have been resolved.
> > * stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
> > * LICENSE and NOTICE files checked
> > * RELEASENOTES and CHANGELOG
> > * rat check passed.
> > * Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
> > * Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
> > * Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2.
> > Had to patch some of them due to commons-lang migration (Hadoop 3.2.0) and
> > dependency divergence. Issues are being identified but so far nothing is a
> > blocker for Hadoop itself.
> >
> > Please try the release and vote. The vote will run for 5 days.
> >
> > My +1 to start,
> >
> > [1] https://issues.apache.org/jira/issues/?filter=12350491
> > [2] https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3
> >
>


[jira] [Created] (HDFS-16062) When a DataNode hot-reloads its configuration, JMX gets stuck for a long time

2021-06-10 Thread JiangHua Zhu (Jira)
JiangHua Zhu created HDFS-16062:
---

 Summary: When a DataNode hot-reloads its configuration, JMX gets stuck
for a long time
 Key: HDFS-16062
 URL: https://issues.apache.org/jira/browse/HDFS-16062
 Project: Hadoop HDFS
 Issue Type: Improvement
 Reporter: JiangHua Zhu









Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2021-06-10 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/

No changes




-1 overall


The following subsystems voted -1:
asflicense hadolint mvnsite pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.fs.TestTrash
   hadoop.fs.TestFileUtil
   hadoop.crypto.key.kms.server.TestKMS
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
   hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
   hadoop.hdfs.server.datanode.TestBlockRecovery
   hadoop.fs.http.client.TestHttpFSFWithSWebhdfsFileSystem
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
   hadoop.hdfs.server.federation.router.TestRouterQuota
   hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat
   hadoop.hdfs.server.federation.resolver.order.TestLocalResolver
   hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver
   hadoop.yarn.server.resourcemanager.TestClientRMService
   hadoop.yarn.server.resourcemanager.monitor.invariants.TestMetricsInvariantChecker
   hadoop.yarn.client.api.impl.TestAMRMClient
   hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter
   hadoop.yarn.sls.TestSLSRunner
   hadoop.resourceestimator.solver.impl.TestLpSolver
   hadoop.resourceestimator.service.TestResourceEstimatorService

   cc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-compile-javac-root.txt  [476K]

   checkstyle:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-checkstyle-root.txt  [16M]

   hadolint:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-patch-hadolint.txt  [4.0K]

   mvnsite:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-mvnsite-root.txt  [560K]

   pathlen:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/pathlen.txt  [12K]

   pylint:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-patch-pylint.txt  [48K]

   shellcheck:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-patch-shellcheck.txt  [56K]

   shelldocs:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-patch-shelldocs.txt  [48K]

   whitespace:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/whitespace-eol.txt  [12M]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/whitespace-tabs.txt  [1.3M]

   javadoc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/diff-javadoc-javadoc-root.txt  [20K]

   unit:

      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt  [212K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-common-project_hadoop-kms.txt  [48K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [444K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt  [20K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt  [12K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt  [40K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt  [124K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt  [16K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/325/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-

Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-10 Thread Masatake Iwasaki

+1

Thanks for the great work, Wei-Chiu Chuang.

* verified signature and checksum.
* built site documentation by `mvn site` and skimmed the contents.
  # found that top-level index.html is not updated.
* built on CentOS 8 (x86_64) and OpenJDK 8 by `mvn install -DskipTests -Pnative -Pdist`.
  * launched pseudo cluster with security enabled and ran sample MR jobs.
  * launched 3-node cluster with NN-HA and RM-HA and ran sample MR jobs.
* built on CentOS 7 (aarch64) and OpenJDK 8 by `mvn install -DskipTests -Pnative -Pdist`.

* built Hive with the patch of HIVE-24484 against hadoop-3.3.1
  and ran TestMiniLlapCliDriver (fixed by HDFS-15790).
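
For anyone who wants to script the checksum part of the verification, here is a minimal sketch in Java. It assumes the expected value is a SHA-512 hex digest copied by hand from the checksum file published next to the artifact; the class `ChecksumSketch` and its method names are made up for illustration and are not part of Hadoop or its release tooling. Signature verification (the GPG step) is not covered here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative helper only; not part of Hadoop or its release scripts. */
final class ChecksumSketch {

    /** Creates a SHA-512 digest, converting the checked exception to an unchecked one. */
    static MessageDigest newSha512() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available in this JRE", e);
        }
    }

    /** Renders bytes as lowercase hex, two characters per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Digest of an in-memory buffer; handy for small inputs and tests. */
    static String sha512Hex(byte[] data) {
        return toHex(newSha512().digest(data));
    }

    /** Digest of a file, streamed in chunks so large tarballs are not loaded into memory. */
    static String sha512Hex(Path file) {
        MessageDigest digest = newSha512();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toHex(digest.digest());
    }

    /** Compares digests, ignoring case and surrounding whitespace in the expected value. */
    static boolean matches(String actualHex, String expectedHex) {
        return actualHex.equalsIgnoreCase(expectedHex.trim());
    }
}
```

Usage would be along the lines of `ChecksumSketch.matches(ChecksumSketch.sha512Hex(pathToTarball), expectedHex)`, where both arguments are supplied by the person verifying the release.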

Masatake Iwasaki







[jira] [Created] (HDFS-16063) Add toString to EditLogFileInputStream

2021-06-10 Thread David Mollitor (Jira)
David Mollitor created HDFS-16063:
-

 Summary: Add toString to EditLogFileInputStream
 Key: HDFS-16063
 URL: https://issues.apache.org/jira/browse/HDFS-16063
 Project: Hadoop HDFS
 Issue Type: Improvement
 Reporter: David Mollitor


The class {{EditLogFileInputStream}} is logged at DEBUG level, but has no 
{{toString}} method, so the logging is of limited value.  Also, put the DEBUG 
statement behind some guards since it's printing an unbounded list of items.

https://github.com/apache/hadoop/blob/eefa664fea1119a9c6e3ae2d2ad3069019fbd4ef/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L895
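
A rough sketch of the two ideas (a readable `toString` plus a guard around the DEBUG statement) could look like the following. Everything here is a stand-in: the class `EditLogStreamSketch`, its fields, and the message text are hypothetical, and `java.util.logging` (with `FINE` playing the role of DEBUG) is used only to keep the example self-contained. The real change would use whatever logger and fields the Hadoop class already has.

```java
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for an edit log input stream; not the real Hadoop class. */
final class EditLogStreamSketch {
    private static final Logger LOG =
        Logger.getLogger(EditLogStreamSketch.class.getName());

    private final String file;
    private final long firstTxId;
    private final long lastTxId;

    EditLogStreamSketch(String file, long firstTxId, long lastTxId) {
        this.file = file;
        this.firstTxId = firstTxId;
        this.lastTxId = lastTxId;
    }

    /** Without this override, logging the object only prints ClassName@hash. */
    @Override
    public String toString() {
        return "EditLogStreamSketch{file=" + file
            + ", firstTxId=" + firstTxId
            + ", lastTxId=" + lastTxId + "}";
    }

    /** Builds the log message; this is the potentially expensive part for long lists. */
    static String describe(List<EditLogStreamSketch> streams) {
        return "Planning to load " + streams.size() + " edit log stream(s): " + streams;
    }

    static void logStreams(List<EditLogStreamSketch> streams) {
        // Guard: only format the (unbounded) list when fine-grained logging is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(describe(streams));
        }
    }
}
```

The guard matters because string concatenation over the whole list happens before the logger decides whether to emit the message; checking the level first avoids that work when DEBUG-style logging is off.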






[jira] [Created] (HDFS-16064) HDFS-721 causes DataNode decommissioning to get stuck indefinitely

2021-06-10 Thread Kevin Wikant (Jira)
Kevin Wikant created HDFS-16064:
---

 Summary: HDFS-721 causes DataNode decommissioning to get stuck
indefinitely
 Key: HDFS-16064
 URL: https://issues.apache.org/jira/browse/HDFS-16064
 Project: Hadoop HDFS
 Issue Type: Bug
 Components: datanode, namenode
 Affects Versions: 3.2.1
 Reporter: Kevin Wikant


It seems that https://issues.apache.org/jira/browse/HDFS-721 was resolved as a
non-issue under the assumption that if the namenode & a datanode get into an
inconsistent state for a given block pipeline, there should be another datanode
available to replicate the block to.

While testing datanode decommissioning using "dfs.hosts.exclude", I have
encountered a scenario where the decommissioning gets stuck indefinitely.

Below is the progression of events:
 * there are initially 4 datanodes: DN1, DN2, DN3, DN4
 * scale-down is started by adding DN1 & DN2 to "dfs.hosts.exclude"
 * HDFS block(s) on DN1 & DN2 must now be replicated to DN3 & DN4 in order to
satisfy their minimum replication factor of 2
 * during this replication process,
https://issues.apache.org/jira/browse/HDFS-721 is encountered, which causes the
following inconsistent state:
 ** DN3 thinks it has the block pipeline in FINALIZED state
 ** the namenode does not think DN3 has the block pipeline

{code:java}
2021-06-06 10:38:23,604 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
(DataXceiver for client  at /DN2:45654 [Receiving block BP-YYY:blk_XXX]): 
DN3:9866:DataXceiver error processing WRITE_BLOCK operation  src: /DN2:45654 
dst: /DN3:9866; 
org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
BP-YYY:blk_XXX already exists in state FINALIZED and thus cannot be created.
{code}
 * the replication is attempted again, but:
 ** DN4 has the block
 ** DN1 and/or DN2 have the block, but don't count towards the minimum
replication factor because they are being decommissioned
 ** the namenode does not count DN3 as having the block, & the block cannot be
replicated to DN3 because of HDFS-721
 * the namenode repeatedly tries to replicate the block to DN3 & repeatedly
fails; this continues indefinitely
 * therefore DN4 is the only live datanode with the block & the minimum
replication factor of 2 cannot be satisfied
 * because the minimum replication factor cannot be satisfied for the block(s)
being moved off DN1 & DN2, the datanode decommissioning can never be completed
 
{code:java}
2021-06-06 10:39:10,106 INFO BlockStateChange (DatanodeAdminMonitor-0): Block: 
blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
decommissioned replicas: 0, decommissioning replicas: 2, maintenance replicas: 
0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: 
false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , Current 
Datanode: DN1:9866, Is current datanode decommissioning: true, Is current 
datanode entering maintenance: false
...
2021-06-06 10:57:10,105 INFO BlockStateChange (DatanodeAdminMonitor-0): Block: 
blk_XXX, Expected Replicas: 2, live replicas: 1, corrupt replicas: 0, 
decommissioned replicas: 0, decommissioning replicas: 2, maintenance replicas: 
0, live entering maintenance replicas: 0, excess replicas: 0, Is Open File: 
false, Datanodes having this block: DN1:9866 DN2:9866 DN4:9866 , Current 
Datanode: DN2:9866, Is current datanode decommissioning: true, Is current 
datanode entering maintenance: false
{code}
Being stuck in the decommissioning state forever is not the intended behavior of
DataNode decommissioning.

A few potential solutions:
 * Address the root cause of the problem, which is an inconsistent state between
namenode & datanode: https://issues.apache.org/jira/browse/HDFS-721
 * Detect when datanode decommissioning is stuck due to a lack of available
datanodes for satisfying the minimum replication factor, then recover by
re-enabling the datanodes being decommissioned
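
To make the "detect when decommissioning is stuck" option a bit more concrete, here is a small sketch in Java. It is not NameNode code: the class `DecommissionStuckSketch`, the `BlockStatus` holder, and the time threshold are all assumptions for illustration. It only models the counters printed in the BlockStateChange log lines above (expected replicas, live replicas, decommissioning replicas) and flags a block whose live replica count has not improved after a chosen amount of time.

```java
import java.time.Duration;

/** Illustrative only: models the counters from the BlockStateChange log lines. */
final class DecommissionStuckSketch {

    /** Hypothetical snapshot of one block's replica counters at a point in time. */
    static final class BlockStatus {
        final int expectedReplicas;
        final int liveReplicas;
        final int decommissioningReplicas;

        BlockStatus(int expectedReplicas, int liveReplicas, int decommissioningReplicas) {
            this.expectedReplicas = expectedReplicas;
            this.liveReplicas = liveReplicas;
            this.decommissioningReplicas = decommissioningReplicas;
        }

        boolean isUnderReplicated() {
            return liveReplicas < expectedReplicas;
        }
    }

    /**
     * Returns true when the block is still under-replicated, still has replicas
     * on decommissioning nodes, has gained no live replicas between the two
     * snapshots, and at least {@code threshold} has elapsed between them.
     */
    static boolean looksStuck(BlockStatus earlier, BlockStatus later,
                              Duration elapsed, Duration threshold) {
        boolean noProgress = later.liveReplicas <= earlier.liveReplicas;
        boolean waitingOnDecommission = later.decommissioningReplicas > 0;
        return later.isUnderReplicated()
            && waitingOnDecommission
            && noProgress
            && elapsed.compareTo(threshold) >= 0;
    }
}
```

With the values from the two log lines above (expected 2, live 1, decommissioning 2, roughly 18 minutes apart) and an arbitrarily chosen 15-minute threshold, `looksStuck` returns true. What the recovery action should be once a block is flagged is a separate question that this sketch does not address.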

 






Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-10 Thread Chao Sun
+1 (non-binding)

- verified signature and checksum
- launched a single docker based cluster and ran some simple HDFS commands
- built Spark master with 3.3.1 RC and: 1) ran the full Spark test suites, all
of which succeeded; 2) tested simple Spark commands against an S3 endpoint; 3)
tested Spark on YARN with a simple example job.

Thanks Wei-Chiu for the great work!



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2021-06-10 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/

[Jun 9, 2021 2:34:45 AM] (noreply) HDFS-15916. Addendum. DistCp: Backward 
compatibility: Distcp fails from Hadoop 3 to Hadoop 2 for snapshotdiff. (#3056)
[Jun 9, 2021 2:37:23 AM] (noreply) YARN-10809. Missing dependency causing 
NoClassDefFoundError in TestHBaseTimelineStorageUtils (#3081)
[Jun 9, 2021 5:24:10 AM] (noreply) HADOOP-17715 ABFS: Append blob tests with 
non HNS accounts fail (#3028)
[Jun 9, 2021 6:12:48 AM] (noreply) HDFS-16054. Replace Guava Lists usage by 
Hadoop's own Lists in hadoop-hdfs-project (#3073)
[Jun 9, 2021 6:15:47 AM] (noreply) YARN-10805. Replace Guava Lists usage by 
Hadoop's own Lists in hadoop-yarn-project (#3075)
[Jun 9, 2021 6:32:07 AM] (noreply) HADOOP-17750. Fix asf license errors in 
newly added files by HADOOP-17727. (#3083)




-1 overall


The following subsystems voted -1:
blanks pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s):
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

Failed junit tests :

   hadoop.yarn.server.router.clientrm.TestFederationClientInterceptor
   hadoop.yarn.csi.client.TestCsiClient

   cc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/results-compile-cc-root.txt  [96K]

   javac:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/results-compile-javac-root.txt  [380K]

   blanks:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/blanks-eol.txt  [13M]
      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/blanks-tabs.txt  [2.0M]

   checkstyle:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/results-checkstyle-root.txt  [16M]

   pathlen:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/results-pathlen.txt  [16K]

   pylint:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/results-pylint.txt  [20K]

   shellcheck:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/results-shellcheck.txt  [28K]

   xml:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/xml.txt  [24K]

   javadoc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/results-javadoc-javadoc-root.txt  [408K]

   unit:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt  [24K]
      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/534/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-csi.txt  [20K]

Powered by Apache Yetus 0.14.0-SNAPSHOT   https://yetus.apache.org
