[jira] [Created] (HDFS-15215) The Timestamp for longest write/read lock held log is wrong

2020-03-10 Thread Toshihiro Suzuki (Jira)
Toshihiro Suzuki created HDFS-15215:
---

 Summary: The Timestamp for longest write/read lock held log is 
wrong
 Key: HDFS-15215
 URL: https://issues.apache.org/jira/browse/HDFS-15215
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Toshihiro Suzuki
Assignee: Toshihiro Suzuki


I found that the timestamp in the longest write/read lock held log is wrong in trunk:

{code}
2020-03-10 16:01:26,585 [main] INFO  namenode.FSNamesystem 
(FSNamesystemLock.java:writeUnlock(281)) -   Number of suppressed write-lock 
reports: 0
Longest write-lock held at 1970-01-03 07:07:40,841+0900 for 3ms via 
java.lang.Thread.getStackTrace(Thread.java:1559)
...
{code}

Looking at the code, the timestamp comes from System.nanoTime(), which returns 
the current value of the running Java Virtual Machine's high-resolution time 
source and can only be used to measure elapsed time, not wall-clock time:
https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--

We need to derive the timestamp from System.currentTimeMillis() instead.
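
To illustrate (a minimal standalone sketch, not the actual FSNamesystemLock code): 
formatting a System.nanoTime() value as if it were epoch milliseconds produces 
dates near 1970, while System.currentTimeMillis() gives the expected wall-clock 
timestamp, and nanoTime() remains fine for measuring the held duration:

{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;

public class LockTimestampSketch {
  public static void main(String[] args) throws InterruptedException {
    long startNanos = System.nanoTime();                // monotonic, elapsed time only
    long startWallClockMs = System.currentTimeMillis(); // epoch millis, for the log timestamp

    Thread.sleep(3); // pretend the lock is held for ~3ms

    long heldMs = (System.nanoTime() - startNanos) / 1_000_000;
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSSZ");

    // Treating nanoTime as epoch millis yields a date near 1970 (its origin is
    // arbitrary), which matches the bogus "1970-01-03" timestamp in the log above.
    System.out.println("Wrong: held at " + fmt.format(new Date(startNanos / 1_000_000))
        + " for " + heldMs + "ms");
    System.out.println("Right: held at " + fmt.format(new Date(startWallClockMs))
        + " for " + heldMs + "ms");
  }
}
{code}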








[jira] [Created] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15216:
-

 Summary: Wrong Use Case of -showprogress in fsck 
 Key: HDFS-15216
 URL: https://issues.apache.org/jira/browse/HDFS-15216
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree


*-showprogress* is deprecated and progress is now shown by default, but the fsck 
help output still describes this option incorrectly (see the highlighted line in 
the usage listing below; a suggested correction follows it).

 

Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | 
-openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
-upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] 
[-maintenance] [-blockId <blk_Id>]
 <path>  start checking from this path
 -move move corrupted files to /lost+found
 -delete delete corrupted files
 -files print out files being checked
 -openforwrite print out files opened for write
 -includeSnapshots include snapshot data if the given path indicates a 
snapshottable directory or there are snapshottable directories under it
 -list-corruptfileblocks print out list of missing blocks and files they belong 
to
 -files -blocks print out block report
 -files -blocks -locations print out locations for every block
 -files -blocks -racks print out network topology for data-node locations
 -files -blocks -replicaDetails print out each replica details
 -files -blocks -upgradedomains print out upgrade domains for every block
 -storagepolicies print out storage policy summary for the blocks
 -maintenance print out maintenance state node details
 *-showprogress show progress in output. Default is OFF (no progress)*
 -blockId print out which file this blockId belongs to, locations (nodes, 
racks) of this block, and other diagnostics info (under replicated, corrupted 
or not, etc)
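
A possible correction (wording is only a suggestion) would be to describe the option 
as deprecated and a no-op instead of claiming progress is off by default, e.g.:

{code}
 -showprogress Deprecated. Progress is now shown by default. This option has no effect.
{code}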






Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-03-10 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/

[Mar 9, 2020 5:03:32 AM] (wwei) HADOOP-16840. AliyunOSS: getFileStatus throws 
FileNotFoundException in




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.registry.secure.TestSecureLogins 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-compile-cc-root-jdk1.8.0_242.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-compile-javac-root-jdk1.8.0_242.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-patch-shellcheck.txt
  [56K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/whitespace-tabs.txt
  [1.3M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-common.txt
  [0]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-shuffle.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
  [0]
   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt
  [4.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/620/artifact/out/diff-javadoc

[jira] [Created] (HDFS-15217) Add more information to longest write/read lock held log

2020-03-10 Thread Toshihiro Suzuki (Jira)
Toshihiro Suzuki created HDFS-15217:
---

 Summary: Add more information to longest write/read lock held log
 Key: HDFS-15217
 URL: https://issues.apache.org/jira/browse/HDFS-15217
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Toshihiro Suzuki
Assignee: Toshihiro Suzuki


Currently, we can see the stack trace in the longest write/read lock held log, 
but sometimes we need more information, for example, the target path of a deletion:
{code:java}
2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
(FSNamesystemLock.java:writeUnlock(276)) -   Number of suppressed write-lock 
reports: 0
Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
...
{code}
Adding more information (opName, path, etc.) to the log would be useful for 
troubleshooting.
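
One way to picture this (a standalone sketch with hypothetical names, not the real 
FSNamesystemLock API): have callers pass the operation name and target path into 
the unlock call so they can be appended to the lock report:

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch only: the real FSNamesystemLock also tracks the longest holder and
// suppresses frequent reports; this just shows where opName/path could be threaded.
public class InstrumentedFSLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long writeLockStartMs;

  public void writeLock() {
    lock.writeLock().lock();
    writeLockStartMs = System.currentTimeMillis();
  }

  // Callers supply the operation name and target path so the report can include
  // them, e.g. writeUnlock("delete", src) at the end of a delete operation.
  public void writeUnlock(String opName, String path) {
    long heldMs = System.currentTimeMillis() - writeLockStartMs;
    lock.writeLock().unlock();
    System.out.println("Write-lock held for " + heldMs + "ms (op=" + opName
        + ", path=" + path + ")");
  }
}
{code}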






[jira] [Created] (HDFS-15218) RBF : MountTableRefresherService fails in a secure cluster.

2020-03-10 Thread Surendra Singh Lilhore (Jira)
Surendra Singh Lilhore created HDFS-15218:
-

 Summary: RBF : MountTableRefresherService fails in a secure cluster.
 Key: HDFS-15218
 URL: https://issues.apache.org/jira/browse/HDFS-15218
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: rbf
Affects Versions: 3.1.1
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore


{code:java}
2020-03-09 12:43:50,082 | ERROR | MountTableRefresh_linux-133:25020 | Failed to 
refresh mount table entries cache at router X:25020 | 
MountTableRefresherThread.java:69
java.io.IOException: DestHost:destPort X:25020 , LocalHost:localPort 
XXX/XXX:0. Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
at 
org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:284)
at 
org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
 {code}
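
One possible direction (a sketch only, not necessarily the actual fix): perform the 
refresh RPC under the Router's Kerberos login UGI so the MountTableRefresherThread 
has valid credentials. The client interface below is a hypothetical stand-in for 
the RouterAdminProtocol client:

{code:java}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class RefreshUnderLoginUserSketch {

  // Hypothetical stand-in for the admin protocol client used by the refresher thread.
  public interface MountTableRefreshClient {
    void refreshMountTableEntries() throws Exception;
  }

  public static void refresh(final MountTableRefreshClient client) throws Exception {
    UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
    loginUser.doAs((PrivilegedExceptionAction<Void>) () -> {
      // Runs with the Router's Kerberos credentials instead of an empty subject.
      client.refreshMountTableEntries();
      return null;
    });
  }
}
{code}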






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2020-03-10 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/

[Mar 9, 2020 4:01:34 AM] (wwei) HADOOP-16840. AliyunOSS: getFileStatus throws 
FileNotFoundException in
[Mar 9, 2020 1:51:58 PM] (brahma) HADOOP-16871. Upgrade Netty version to 
4.1.45.Final to handle
[Mar 9, 2020 2:37:08 PM] (github) HADOOP-16909 Typo in distcp counters.
[Mar 9, 2020 2:44:28 PM] (stevel) HADOOP-14630 Contract Tests to verify create, 
mkdirs and rename under a
[Mar 9, 2020 2:51:16 PM] (stevel) HADOOP-16898. Batch listing of multiple 
directories via an (unstable)
[Mar 9, 2020 3:08:24 PM] (snemeth) YARN-9419. Log a warning if GPU isolation is 
enabled but




-1 overall


The following subsystems voted -1:
asflicense findbugs pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

FindBugs :

   module:hadoop-cloud-storage-project/hadoop-cos 
   Redundant nullcheck of dir, which is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at 
BufferPool.java:[line 66] 
   org.apache.hadoop.fs.cosn.CosNInputStream$ReadBuffer.getBuffer() may 
expose internal representation by returning CosNInputStream$ReadBuffer.buffer 
At CosNInputStream.java:[line 87] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, File, 
byte[]): new String(byte[]) At CosNativeFileSystemStore.java:[line 199] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long): new String(byte[]) At 
CosNativeFileSystemStore.java:[line 178] 
   org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.uploadPart(File, 
String, String, int) may fail to clean up java.io.InputStream Obligation to 
clean up resource created at CosNativeFileSystemStore.java:[line 252] is not 
discharged 

Failed junit tests :

   hadoop.hdfs.server.datanode.TestDataNodeUUID 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.yarn.sls.appmaster.TestAMSimulator 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/diff-compile-cc-root.txt
  [8.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/diff-compile-javac-root.txt
  [424K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/diff-checkstyle-root.txt
  [16M]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/diff-patch-shellcheck.txt
  [16K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/diff-patch-shelldocs.txt
  [44K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/whitespace-eol.txt
  [9.9M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1434/artifact/out/whitespace-tabs.txt
  [1.1M]

   xml:

   
https://builds.apache.org/job/had

Re: [DISCUSS] Accelerate Hadoop dependency updates

2020-03-10 Thread Wei-Chiu Chuang
I'm not hearing any feedback so far, but I want to suggest:

use the hadoop-thirdparty repository to host any dependencies that are known to
break compatibility.

Candidate #1 guava
Candidate #2 Netty
Candidate #3 Jetty

In fact, HBase shades these dependencies for exactly the same reason.
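
To illustrate what that buys downstream code (a rough sketch; the relocation prefix 
below is how I recall hadoop-thirdparty relocating Guava, so please double-check the 
published artifacts): callers import the relocated copy instead of com.google.common, 
so upgrading the unshaded Guava elsewhere on the classpath cannot break those call 
sites.

{code:java}
// Uses the shaded Guava published by hadoop-thirdparty rather than com.google.common.
import org.apache.hadoop.thirdparty.com.google.common.base.Preconditions;

public class ShadedGuavaExample {
  public static int checkPort(int port) {
    // Resolved against the relocated, version-pinned Guava, so an unshaded Guava
    // upgrade on the application classpath does not affect this call site.
    Preconditions.checkArgument(port > 0 && port < 65536, "invalid port: %s", port);
    return port;
  }
}
{code}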

As an example of the cost of compatibility breakage: we have spent the last 6
months backporting the guava update (guava 11 --> 27) throughout
Cloudera's stack, and after 6 months we are still not done, because we have to
update guava in Hadoop, Hive, Spark ..., and the guava from Hadoop, Hive and
Spark is on the classpath of every application.

Thoughts?

On Sat, Mar 7, 2020 at 9:31 AM Wei-Chiu Chuang  wrote:

> Hi Hadoop devs,
>
> In the past, Hadoop has tended to be pretty far behind the latest versions of
> dependencies. Part of that is due to the fear of the breaking changes
> brought in by the dependency updates.
>
> However, things have changed dramatically over the past few years. With
> more focus on security vulnerabilities, more vulnerabilities are discovered
> in our dependencies, and users put more pressure on patching Hadoop (and
> its ecosystem) to use the latest dependency versions.
>
> As an example, Jackson-databind had 20 CVEs published in the last year
> alone.
> https://www.cvedetails.com/product/42991/Fasterxml-Jackson-databind.html?vendor_id=15866
>
> Jetty: 4 CVEs in 2019:
> https://www.cvedetails.com/product/34824/Eclipse-Jetty.html?vendor_id=10410
>
> We can no longer let Hadoop stay behind. The more we stay behind, the
> harder it is to update. A good example is the Jersey 1 -> 2 migration,
> HADOOP-15984, contributed
> by Akira. Jersey 1 is no longer supported, but the Jersey 2 migration is hard.
> If any critical vulnerability is found in Jersey 1, it will leave us in a
> bad situation, since we can't simply update the Jersey version and be done.
>
> Hadoop 3 adds new public artifacts that shade these dependencies. We
> should encourage downstream applications to use these public artifacts to
> avoid breakage.
>
> I'd like to hear your thoughts: are you okay with Hadoop keeping up with
> the latest dependency updates, or would you rather stay behind to ensure
> compatibility?
>
> Coupled with that, I'd like to call for more frequent Hadoop releases for
> the same purpose. IMHO that'll require better infrastructure to assist the
> release work and some rethinking of our current Hadoop code structure, such as
> separating each subproject into its own repository and release cadence. This
> can be controversial, but I think it'll be good for the project in the long
> run.
>
> Thanks,
> Wei-Chiu
>


[NOTICE] branch HADOOP-15566-OpenTracing created

2020-03-10 Thread Wei-Chiu Chuang
Hi devs,

I forked a branch HADOOP-15566-OpenTracing from trunk. This branch will be used
to develop HADOOP-15566 (OpenTracing support in Hadoop).


This week's APAC Hadoop storage community online sync

2020-03-10 Thread Wei-Chiu Chuang
Hi!

Gentle reminder: Tomorrow's the APAC Hadoop storage community online sync.

Date/time:
March 12th 1PM (China) / 2PM (Japan) / 10:30AM (India)
March 11th 10PM (US West Coast)

Zoom link: https://cloudera.zoom.us/j/880548968

Past meeting minutes:
https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit?usp=sharing

See you tomorrow!