[jira] [Resolved] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-17125.
-
Fix Version/s: 3.3.1
   Resolution: Fixed

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: DB Tsai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1
>
>  Time Spent: 24h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in the system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments, which adds huge 
> complexity from a deployment point of view. In some environments, it requires 
> compiling the natives from source, which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work on a different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results in higher application deployment and maintenance costs 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in the jar 
> file, and it can automatically load the native binaries into the JVM from the jar 
> without any setup. If a native implementation cannot be found for a 
> platform, it can fall back to a pure-Java implementation of snappy based on 
> [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2020-10-07 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/

[Oct 6, 2020 6:46:08 PM] (Jim Brennan) YARN-10451. RM (v1) UI NodesPage can NPE 
when yarn.io/gpu resource type is defined. Contributed by Eric Payne




-1 overall


The following subsystems voted -1:
asflicense hadolint jshint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   hadoop-build-tools/src/main/resources/checkstyle/checkstyle.xml 
   hadoop-build-tools/src/main/resources/checkstyle/suppressions.xml 
   hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-tools/hadoop-azure/src/config/checkstyle.xml 
   hadoop-tools/hadoop-resourceestimator/src/config/checkstyle.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml 

Failed junit tests :

   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat 
   hadoop.hdfs.server.federation.resolver.order.TestLocalResolver 
   hadoop.hdfs.server.federation.router.TestRouterQuota 
   hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver 
   hadoop.yarn.server.resourcemanager.TestClientRMService 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter 
   hadoop.tools.TestDistCpSystem 
   hadoop.resourceestimator.service.TestResourceEstimatorService 
   hadoop.resourceestimator.solver.impl.TestLpSolver 
  

   jshint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-patch-jshint.txt
  [208K]

   cc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-compile-javac-root.txt
  [456K]

   checkstyle:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-patch-pylint.txt
  [60K]

   shellcheck:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-patch-shellcheck.txt
  [56K]

   shelldocs:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/whitespace-eol.txt
  [12M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/whitespace-tabs.txt
  [1.3M]

   xml:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/xml.txt
  [4.0K]

   javadoc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/diff-javadoc-javadoc-root.txt
  [20K]

   unit:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [272K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [12K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
  [36K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
  [120K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/79/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservi

[jira] [Resolved] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem

2020-10-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-17281.
-
Fix Version/s: 3.3.1
   Resolution: Fixed

merged into 3.1+; looking forward to this. The HADOOP-16380 stats code will 
need to be wired up to this; I'm not doing it *yet* as I don't want to rebase 
everything there

> Implement FileSystem.listStatusIterator() in S3AFileSystem
> --
>
> Key: HADOOP-17281
> URL: https://issues.apache.org/jira/browse/HADOOP-17281
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Mukund Thakur
>Assignee: Mukund Thakur
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() API, which returns an 
> array. Once we implement listStatusIterator(), clients can benefit from 
> the async listing recently added in 
> https://issues.apache.org/jira/browse/HADOOP-17074 by performing tasks 
> on files while iterating over them.
>  
> CC [~stevel]
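
To illustrate the difference the description points at, here is a minimal sketch, not the S3A implementation. `RemoteIter`, `PagedStore` and `ListingDemo` are hypothetical stand-ins defined only for this example; each `fetchPage()` call models one remote LIST request. The array-style call has to fetch every page before returning, while the iterator fetches a page only when the caller asks for more entries.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

/** Stand-in with a hasNext()/next() shape; defined locally, not Hadoop's type. */
interface RemoteIter<T> {
  boolean hasNext() throws IOException;
  T next() throws IOException;
}

/** Hypothetical paged store: each fetchPage() call models one remote LIST request. */
final class PagedStore {
  private final List<String> names;
  private final int pageSize;
  int pagesFetched = 0;

  PagedStore(List<String> names, int pageSize) {
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be positive");
    }
    this.names = new ArrayList<>(names);
    this.pageSize = pageSize;
  }

  /** Returns the entries in [offset, offset + pageSize), clipped to the end of the listing. */
  private List<String> fetchPage(int offset) {
    pagesFetched++;
    int end = Math.min(offset + pageSize, names.size());
    return new ArrayList<>(names.subList(offset, end));
  }

  /** Array-style listing: every page is fetched before the caller sees a single entry. */
  String[] listAll() {
    List<String> all = new ArrayList<>();
    for (int offset = 0; offset < names.size(); offset += pageSize) {
      all.addAll(fetchPage(offset));
    }
    return all.toArray(new String[0]);
  }

  /** Iterator-style listing: a page is fetched only once the previous one is used up. */
  RemoteIter<String> listIterator() {
    return new RemoteIter<String>() {
      private List<String> page = new ArrayList<>();
      private int indexInPage = 0;
      private int nextOffset = 0;

      @Override
      public boolean hasNext() {
        if (indexInPage >= page.size() && nextOffset < names.size()) {
          page = fetchPage(nextOffset);
          nextOffset += page.size();
          indexInPage = 0;
        }
        return indexInPage < page.size();
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException("No more items in iterator");
        }
        return page.get(indexInPage++);
      }
    };
  }
}

final class ListingDemo {
  /** Reads the first n entries through the iterator and reports how many pages that cost. */
  static int pagesFetchedForFirst(int n, List<String> names, int pageSize) {
    PagedStore store = new PagedStore(names, pageSize);
    RemoteIter<String> it = store.listIterator();
    try {
      for (int i = 0; i < n && it.hasNext(); i++) {
        it.next();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return store.pagesFetched;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
    PagedStore store = new PagedStore(names, 3);
    System.out.println(store.listAll().length + " entries, pages=" + store.pagesFetched);
    System.out.println("first 2 via iterator, pages=" + pagesFetchedForFirst(2, names, 3));
  }
}
```

With seven entries and a page size of three, the array-style call costs three page fetches, whereas a caller that stops after two entries through the iterator pays for one. The page size and entry names are arbitrary values chosen for the sketch.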






[jira] [Resolved] (HADOOP-17300) FileSystem.DirListingIterator.next() call should return NoSuchElementException

2020-10-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-17300.
-
Fix Version/s: 3.3.1
   Resolution: Fixed

Fixed in HADOOP-17281, because the changes to the contract tests there found 
the bug

> FileSystem.DirListingIterator.next() call should return NoSuchElementException
> --
>
> Key: HADOOP-17300
> URL: https://issues.apache.org/jira/browse/HADOOP-17300
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common, fs
>Reporter: Mukund Thakur
>Assignee: Mukund Thakur
>Priority: Major
> Fix For: 3.3.1
>
>
> FileSystem.DirListingIterator.next() should throw 
> NoSuchElementException rather than IllegalStateException
>  
> Stacktrace for new test failure:
>  
> {code:java}
> java.lang.IllegalStateException: No more items in iterator
>   at com.google.common.base.Preconditions.checkState(Preconditions.java:507)
>   at org.apache.hadoop.fs.FileSystem$DirListingIterator.next(FileSystem.java:2232)
>   at org.apache.hadoop.fs.FileSystem$DirListingIterator.next(FileSystem.java:2205)
>   at org.apache.hadoop.fs.contract.ContractTestUtils.iteratorToListThroughNextCallsAlone(ContractTestUtils.java:1495)
>   at org.apache.hadoop.fs.contract.AbstractContractGetFileStatusTest.testListStatusIteratorFile(AbstractContractGetFileStatusTest.java:366)
> {code}
>  
> CC [~ste...@apache.org]
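
Why the exception type matters can be shown with a small sketch. It is not the Hadoop code: `ArrayListingIterator` and `NextOnlyDrain` are hypothetical names invented here. The stack trace shows the failure surfacing in a test helper that drains an iterator through next() calls alone; the assumption made below is that such a helper treats NoSuchElementException, the java.util.Iterator convention for exhaustion, as the normal end-of-listing signal, so any other exception type escapes as a failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a directory-listing iterator; not Hadoop's class. */
final class ArrayListingIterator {
  private final String[] entries;
  private int position = 0;

  ArrayListingIterator(String... entries) {
    this.entries = entries.clone();
  }

  boolean hasNext() {
    return position < entries.length;
  }

  /** Follows the java.util.Iterator convention: exhaustion raises NoSuchElementException. */
  String next() {
    if (!hasNext()) {
      throw new NoSuchElementException("No more items in iterator");
    }
    return entries[position++];
  }
}

final class NextOnlyDrain {
  /** Drains the iterator without calling hasNext(); the exception marks the end. */
  static List<String> drain(ArrayListingIterator it) {
    List<String> out = new ArrayList<>();
    try {
      while (true) {
        out.add(it.next());
      }
    } catch (NoSuchElementException expected) {
      // Normal termination: the iterator is exhausted.
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(drain(new ArrayListingIterator("a", "b", "c")));  // prints [a, b, c]
  }
}
```

If next() threw IllegalStateException instead, the catch clause in drain() would not match and the exception would propagate to the caller, which is the kind of failure the stack trace above reports.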






reviewers needed: HADOOP-16830 Add public IOStatistics API

2020-10-07 Thread Steve Loughran
Hi,

Can I get some reviews of this PR?
https://github.com/apache/hadoop/pull/2323

It adds a new API, IOStatisticsSource, for any class to act as a source of
a static or dynamic IOStatistics set of counters/gauges/min/max/mean stats.

The intent is to allow applications to collect statistics on streams,
iterators, and other classes they use to interact with filesystems/remote
stores, so they get detailed statistics on the number of operations,
latencies, etc. There is support for logging these results, as well as for
aggregating them.


Here is the API specification:

https://github.com/steveloughran/hadoop/blob/s3/HADOOP-16830-iostatistics-common/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/iostatistics.md

The FSDataStreams do passthrough of this, and there's a set of remote
iterators which also do passthrough, making it easy to chain/wrap
iteration code.
https://github.com/steveloughran/hadoop/blob/s3/HADOOP-16830-iostatistics-common/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/functional/RemoteIterators.java

It also includes a statistics snapshot which can be serialized as JSON and
as Java objects, and which can aggregate results:
https://github.com/steveloughran/hadoop/blob/s3/HADOOP-16830-iostatistics-common/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/IOStatisticsSnapshot.java

This is how applications can aggregate results, and then propagate them back
to the AM/job driver/query engine.
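
As a rough illustration of the aggregation idea only, here is a sketch that does not use the API from the PR. `CounterSource`, `CounterSnapshot`, `CountingStream` and the counter names are all hypothetical, and only counters are modelled (no gauges, min/max or means). Each stream keeps its own counters, and a snapshot sums them so that a single result can be passed back to a driver.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a class that exposes statistics; not the API in the PR. */
interface CounterSource {
  Map<String, Long> counters();
}

/** Hypothetical snapshot that sums the counters of many sources. */
final class CounterSnapshot implements CounterSource {
  private final Map<String, Long> totals = new TreeMap<>();

  /** Adds every counter of the source into the running totals. */
  void aggregate(CounterSource source) {
    for (Map.Entry<String, Long> e : source.counters().entrySet()) {
      totals.merge(e.getKey(), e.getValue(), Long::sum);
    }
  }

  @Override
  public Map<String, Long> counters() {
    return Collections.unmodifiableMap(totals);
  }
}

/** Hypothetical stream that counts its own GET requests and retries. */
final class CountingStream implements CounterSource {
  private final Map<String, Long> counters = new TreeMap<>();

  void recordGet(boolean retried) {
    counters.merge("get_requests", 1L, Long::sum);
    if (retried) {
      counters.merge("get_retries", 1L, Long::sum);
    }
  }

  @Override
  public Map<String, Long> counters() {
    return Collections.unmodifiableMap(counters);
  }
}

final class StatsAggregationDemo {
  /** Two streams record their work; one snapshot collects the totals for the job. */
  static Map<String, Long> run() {
    CountingStream first = new CountingStream();
    first.recordGet(false);
    first.recordGet(true);
    CountingStream second = new CountingStream();
    second.recordGet(true);

    CounterSnapshot job = new CounterSnapshot();
    job.aggregate(first);
    job.aggregate(second);
    return job.counters();
  }

  public static void main(String[] args) {
    System.out.println(run());  // prints {get_requests=3, get_retries=2}
  }
}
```

The real interfaces, their method names and the serialization support are described in the specification and source files linked above; nothing in this sketch should be read as their signatures.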

We already have PRs using this for S3A and ABFS on input streams, and in
S3A we also count LIST performance, which clients can pick up provided they
use the listStatusIterator, listFiles etc calls which return RemoteIterator.

I know it's a lot of code, but it's split into interface and
implementation. The public interface is for applications; the
implementation is what we are using internally, and which we will tune as
we adopt it more.

I have been working on this on and off for months, and yes it has grown.
But now that we are supporting more complex storage systems, the existing
tracking of long/short reads isn't informative enough. I want to know how
many GET requests failed and had to be retried, how often the DELETE calls
were throttled, and what the real latency of list operations is over
long-haul connections.

Please take a look. As a new API it's unlikely to cause any regressions;
the main things to worry about are "is that API the one applications can
use?" and "has Steve got something fundamentally wrong in his implementation
code?"

-Steve


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2020-10-07 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/

[Oct 6, 2020 4:07:54 PM] (noreply) HADOOP-17125. Use snappy-java in SnappyCodec 
(#2297)
[Oct 6, 2020 6:18:08 PM] (Jim Brennan) YARN-10451. RM (v1) UI NodesPage can NPE 
when yarn.io/gpu resource type is defined. Contributed by Eric Payne
[Oct 6, 2020 9:58:42 PM] (Wei-Chiu Chuang) HADOOP-16990. Update Mockserver. 
Contributed by Attila Doroszlai.




-1 overall


The following subsystems voted -1:
asflicense compile golang pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml 

Failed junit tests :

   hadoop.crypto.key.kms.server.TestKMS 
   hadoop.hdfs.TestFileChecksum 
   hadoop.hdfs.TestReadStripedFileWithMissingBlocks 
   hadoop.hdfs.TestDFSShell 
   hadoop.hdfs.TestFileChecksumCompositeCrc 
   hadoop.hdfs.TestGetFileChecksum 
   hadoop.hdfs.server.datanode.TestBPOfferService 
   hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks 
   hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped 
   hadoop.hdfs.server.namenode.TestFileTruncate 
   hadoop.hdfs.server.namenode.TestNamenodeStorageDirectives 
   hadoop.hdfs.TestDecommissionWithStriped 
   hadoop.fs.contract.router.web.TestRouterWebHDFSContractAppend 
   hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerOvercommit 
   hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher 
   hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
   hadoop.yarn.server.router.webapp.TestRouterWebServicesREST 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.mapreduce.v2.app.rm.TestRMCommunicator 
   hadoop.tools.TestDistCpSystem 
   hadoop.yarn.sls.TestReservationSystemInvariants 
  

   compile:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/patch-compile-root.txt
  [500K]

   cc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/patch-compile-root.txt
  [500K]

   golang:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/patch-compile-root.txt
  [500K]

   javac:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/patch-compile-root.txt
  [500K]

   checkstyle:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/diff-checkstyle-root.txt
  [16M]

   pathlen:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/diff-patch-pylint.txt
  [60K]

   shellcheck:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/diff-patch-shelldocs.txt
  [44K]

   whitespace:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/whitespace-eol.txt
  [13M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/whitespace-tabs.txt
  [1.9M]

   xml:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/xml.txt
  [24K]

   javadoc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/diff-javadoc-javadoc-root.txt
  [1.3M]

   unit:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/288/artifact/out/patch-unit-hadoop-common-project_hadoop

[jira] [Created] (HADOOP-17301) ABFS: Fix bug introduced in HADOOP-16852 which reports read-ahead error back

2020-10-07 Thread Sneha Vijayarajan (Jira)
Sneha Vijayarajan created HADOOP-17301:
--

 Summary: ABFS: Fix bug introduced in HADOOP-16852 which reports 
read-ahead error back
 Key: HADOOP-17301
 URL: https://issues.apache.org/jira/browse/HADOOP-17301
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.3.0
Reporter: Sneha Vijayarajan
Assignee: Sneha Vijayarajan


When reads done by read-ahead buffers failed, the exceptions were dropped and 
the failure was not reported to the calling app. 

HADOOP-16852 (Report read-ahead error back) tried to handle this scenario by 
reporting the error back to the calling app, but the commit introduced a bug 
which can lead to a ReadBuffer being injected into the read-completed queue 
twice. 


