[jira] [Created] (HADOOP-17258) MagicS3GuardCommitter fails with `pendingset` already exists

2020-09-11 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created HADOOP-17258:
--

 Summary: MagicS3GuardCommitter fails with `pendingset` already 
exists
 Key: HADOOP-17258
 URL: https://issues.apache.org/jira/browse/HADOOP-17258
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 3.2.0
Reporter: Dongjoon Hyun


`MagicS3GuardCommitter.innerCommitTask` has `false` at `pendingSet.save`.
{code}
try {
  pendingSet.save(getDestFS(), taskOutcomePath, false);
} catch (IOException e) {
  LOG.warn("Failed to save task commit data to {} ",
  taskOutcomePath, e);
  abortPendingUploads(context, pendingSet.getCommits(), true);
  throw e;
}
{code}

And, it can cause a job failure like the following.
{code}
WARN TaskSetManager: Lost task 1562.1 in stage 1.0 (TID 1788, 100.92.11.63, 
executor 26): org.apache.spark.SparkException: Task failed while writing rows.
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:257)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:170)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 
s3a://test-bucket-813987666268/dongjoon/t/__magic/app-attempt-/task_20200911063607_0001_m_001562.pendingset
 already exists
at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:761)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:987)
at org.apache.hadoop.util.JsonSerialization.save(JsonSerialization.java:269)
at 
org.apache.hadoop.fs.s3a.commit.files.PendingSet.save(PendingSet.java:170)
at 
org.apache.hadoop.fs.s3a.commit.magic.MagicS3GuardCommitter.innerCommitTask(MagicS3GuardCommitter.java:220)
at 
org.apache.hadoop.fs.s3a.commit.magic.MagicS3GuardCommitter.commitTask(MagicS3GuardCommitter.java:165)
at 
org.apache.spark.mapred.SparkHadoopMapRedUtil$.performCommit$1(SparkHadoopMapRedUtil.scala:50)
at 
org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:77)
at 
org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.commitTask(HadoopMapReduceCommitProtocol.scala:244)
at 
org.apache.spark.sql.execution.datasources.FileFormatDataWriter.commit(FileFormatDataWriter.scala:78)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:247)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:242)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2020-09-11 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/

[Sep 10, 2020 11:37:32 AM] (noreply) HADOOP-17253. Upgrade zookeeper to 3.4.14 
on branch-2.10. (#2289)
[Sep 10, 2020 9:40:05 PM] (Jonathan Hung) YARN-8210. AMRMClient logging on 
every heartbeat to track updation of AM RM token causes too many log lines to 
be generated in AM logs. (Suma Shivaprasad via wangda)




-1 overall


The following subsystems voted -1:
asflicense hadolint jshint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   hadoop-build-tools/src/main/resources/checkstyle/checkstyle.xml 
   hadoop-build-tools/src/main/resources/checkstyle/suppressions.xml 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-tools/hadoop-azure/src/config/checkstyle.xml 
   hadoop-tools/hadoop-resourceestimator/src/config/checkstyle.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

Failed junit tests :

   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.contrib.bkjournal.TestBookKeeperJournalManager 
   
hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.server.namenode.ha.TestBootstrapStandby 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.contrib.bkjournal.TestBookKeeperJournalManager 
   hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat 
   hadoop.hdfs.server.federation.resolver.order.TestLocalResolver 
   hadoop.hdfs.server.federation.router.TestRouterQuota 
   hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver 
   hadoop.yarn.server.resourcemanager.scheduler.fair.TestAppRunnability 
   hadoop.yarn.server.resourcemanager.rmapp.TestApplicationLifetimeMonitor 
   hadoop.yarn.server.resourcemanager.TestClientRMService 
   hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter 
   hadoop.resourceestimator.service.TestResourceEstimatorService 
   hadoop.resourceestimator.solver.impl.TestLpSolver 
  

   jshint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-patch-jshint.txt
  [208K]

   cc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-compile-javac-root.txt
  [456K]

   checkstyle:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-patch-pylint.txt
  [60K]

   shellcheck:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-patch-shellcheck.txt
  [56K]

   shelldocs:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/whitespace-eol.txt
  [12M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/whitespace-tabs.txt
  [1.3M]

   xml:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/xml.txt
  [4.0K]

   javadoc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/diff-javadoc-javadoc-root.txt
  [20K]

   unit:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [276K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [20K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/53/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
  [36K]
  

[jira] [Reopened] (HADOOP-17244) HADOOP-17244. S3A directory delete tombstones dir markers prematurely.

2020-09-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reopened HADOOP-17244:
-

seeing failures with this on cli but not IDE
{code}
[INFO] Running org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost
[ERROR] Tests run: 16, Failures: 2, Errors: 0, Skipped: 2, Time elapsed: 78.855 
s <<< FAILURE! - in org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost
[ERROR] 
testDirMarkersSubdir[raw-delete-markers](org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost)
  Time elapsed: 6.48 s  <<< FAILURE!
java.lang.AssertionError: operation returning true: object_delete_requests 
expected:<1> but was:<2>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at 
org.apache.hadoop.fs.s3a.S3ATestUtils$MetricDiff.assertDiffEquals(S3ATestUtils.java:960)
at 
org.apache.hadoop.fs.s3a.performance.OperationCostValidator$ExpectSingleStatistic.verify(OperationCostValidator.java:379)
at 
org.apache.hadoop.fs.s3a.performance.OperationCostValidator.exec(OperationCostValidator.java:153)
at 
org.apache.hadoop.fs.s3a.performance.AbstractS3ACostTest.verifyMetrics(AbstractS3ACostTest.java:331)
at 
org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost.testDirMarkersSubdir(ITestS3ADeleteCost.java:204)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)

[ERROR] 
testDirMarkersSubdir[auth-delete-markers](org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost)
  Time elapsed: 3.028 s  <<< FAILURE!
java.lang.AssertionError: operation returning true: object_delete_requests 
expected:<1> but was:<2>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at 
org.apache.hadoop.fs.s3a.S3ATestUtils$MetricDiff.assertDiffEquals(S3ATestUtils.java:960)
at 
org.apache.hadoop.fs.s3a.performance.OperationCostValidator$ExpectSingleStatistic.verify(OperationCostValidator.java:379)
at 
org.apache.hadoop.fs.s3a.performance.OperationCostValidator.exec(OperationCostValidator.java:153)
at 
org.apache.hadoop.fs.s3a.performance.AbstractS3ACostTest.verifyMetrics(AbstractS3ACostTest.java:331)
at 
org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost.testDirMarkersSubdir(ITestS3ADeleteCost.java:204)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2020-09-11 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/

[Sep 10, 2020 10:25:23 AM] (noreply) HDFS-15563. Incorrect getTrashRoot return 
value when a non-snapshottable dir prefix matches the path of a snapshottable 
dir (#2295)
[Sep 10, 2020 4:03:52 PM] (noreply) HADOOP-17244. S3A directory delete 
tombstones dir markers prematurely. (#2280)




-1 overall


The following subsystems voted -1:
asflicense mvnsite pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

Failed junit tests :

   hadoop.hdfs.TestFileChecksum 
   hadoop.hdfs.TestRollingUpgrade 
   hadoop.hdfs.TestFileChecksumCompositeCrc 
   hadoop.hdfs.TestDFSOutputStream 
   hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap 
   hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor 
   
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption 
   hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.yarn.sls.appmaster.TestAMSimulator 
  

   cc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/diff-compile-cc-root.txt
  [48K]

   javac:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/diff-compile-javac-root.txt
  [568K]

   checkstyle:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/diff-checkstyle-root.txt
  [16M]

   mvnsite:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/patch-mvnsite-root.txt
  [484K]

   pathlen:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/diff-patch-pylint.txt
  [60K]

   shellcheck:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/diff-patch-shelldocs.txt
  [96K]

   whitespace:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/whitespace-eol.txt
  [13M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/whitespace-tabs.txt
  [1.9M]

   xml:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/xml.txt
  [24K]

   javadoc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/diff-javadoc-javadoc-root.txt
  [1.3M]

   unit:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [432K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
  [68K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
  [108K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt
  [16K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/patch-unit-hadoop-tools_hadoop-sls.txt
  [12K]

   asflicense:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/262/artifact/out/patch-asflicense-problems.txt
  [4.0K]

P

[jira] [Created] (HADOOP-17259) SSLFactory should fallback to input config if ssl-*.xml fail to load from classpath

2020-09-11 Thread Xiaoyu Yao (Jira)
Xiaoyu Yao created HADOOP-17259:
---

 Summary: SSLFactory should fallback to input config if ssl-*.xml 
fail to load from classpath
 Key: HADOOP-17259
 URL: https://issues.apache.org/jira/browse/HADOOP-17259
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.8.5
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao


Some applications like Tez does not have ssl-client.xml and ssl-server.xml in 
classpath. Instead, it directly pass the parsed SSL configuration as the input 
configuration object. This ticket is opened to allow this case. TEZ-4096 
attempts to solve this issue but but take a different approach which may not 
work in existing Hadoop clients that use SSLFactory from hadoop-common. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org