Pinged Bobby who has access to the machines, and this is indeed what is
happening. There were two cases:
1) When TestUberAM times out, the minicluster it is running, along with
its MRAppMaster process, can escape. There are a ton of threads in those
processes, so it doesn't take very many of these leaks before the
process ulimit is hit.
2) There were a couple of other surefire processes that had leaked, but
they had no discernible state left that would identify which test they
were running, other than that it was something inside
mapreduce-client-jobclient
(which could still be TestUberAM). The main thread and most other
non-daemon threads were gone, but there was a lone SocketReader thread
that was still hanging around. It wasn't a daemon thread and was
apparently the only thread keeping the JVM alive.
So we need to prioritize fixing the TestUberAM hang, currently tracked
by MAPREDUCE-5481 <https://issues.apache.org/jira/browse/MAPREDUCE-5481>,
and/or find a way to keep its processes from escaping during builds.
There might
be another issue where SocketReader threads can prevent the JVM from
shutting down completely in some cases.
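
As one possible way to keep a timed-out test from leaving processes
behind, a build wrapper could run the test command in its own process
group and kill the whole group on timeout. This is only a sketch under
assumptions: the function name run_with_group_timeout is made up for
illustration, it is POSIX-only, and it is not how the Hadoop build or
surefire currently works.

```python
import os
import signal
import subprocess


def run_with_group_timeout(cmd, timeout_s):
    """Run cmd in its own session; on timeout, kill its whole process group.

    Returns (returncode, timed_out). Hypothetical helper, POSIX only.
    """
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new process group (pgid == child pid) that its descendants
    # inherit unless they deliberately start a session or group of their own.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        # Signal every process in the group, not just the direct child.
        os.killpg(proc.pid, signal.SIGKILL)
        return proc.wait(), True
```

Anything that detaches into its own session would still escape this,
so it reduces leaks rather than ruling them out.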
Jason
On 10/31/2013 08:19 AM, Jason Lowe wrote:
I don't think that OOM error below indicates it needs more heap space,
as it's complaining about the ability to create a new native thread.
That usually is caused by lack of available virtual address space or
hitting process ulimits.
What's most likely going on is the jenkins user is hitting a process
ulimit. This can occur if processes have "leaked" from previous
build/test runs and are using a large number of threads, or a large
number of processes have leaked overall. Could someone with access to
the build machines check if that is indeed the case? If it is, bonus
points for identifying the source of the leak. ;-)
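
For whoever does the check, here is a rough sketch of one way to see
how many processes and threads a user currently owns on Linux, by
reading /proc/<pid>/status and comparing against the RLIMIT_NPROC soft
limit (on Linux that limit counts threads, not just processes). The
function names are made up for illustration, and it would need to be
run as the build user to report that user's limit.

```python
import os
import resource
from collections import Counter


def parse_status(text):
    """Extract (real_uid, thread_count) from the text of /proc/<pid>/status."""
    uid = threads = None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Uid":
            uid = int(value.split()[0])  # first field is the real UID
        elif key == "Threads":
            threads = int(value)
    return uid, threads


def threads_per_process(uid, proc_root="/proc"):
    """Map pid -> thread count for every process owned by uid (Linux only)."""
    counts = Counter()
    if not os.path.isdir(proc_root):
        return counts
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc_root, entry, "status")
            with open(path, errors="replace") as f:
                owner, threads = parse_status(f.read())
        except OSError:
            continue  # process exited (or is unreadable) while scanning
        if owner == uid and threads is not None:
            counts[int(entry)] = threads
    return counts


if __name__ == "__main__":
    counts = threads_per_process(os.getuid())
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print(f"{len(counts)} processes, {sum(counts.values())} threads, "
          f"soft limit {soft}")
    for pid, n in counts.most_common(5):  # biggest thread users first
        print(pid, n)
```

A handful of leaked JVMs with very high thread counts at the top of
that list would be consistent with the leak theory.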
Thanks!
Jason
On 10/30/2013 05:39 PM, Roman Shaposhnik wrote:
I can take a look sometime later today. In the meantime, I can only
say that I've been running into the 1GB limit in a few builds as
of late. These days I just go with 2G by default.
Thanks,
Roman.
On Wed, Oct 30, 2013 at 3:33 PM, Alejandro Abdelnur
<t...@cloudera.com> wrote:
The following is happening in builds for MAPREDUCE and YARN patches.
I've seen the failures in hadoop5 and hadoop7 machines. I've increased
Maven memory to 1GB (export MAVEN_OPTS="-Xmx1024m" in the jenkins
jobs) but still some failures persist:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4159/
Does anybody have an idea of what may be going on?
thx
[INFO] --- native-maven-plugin:1.0-alpha-7:javah (default) @ hadoop-common ---
[INFO] /bin/sh -c cd
/home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-common-project/hadoop-common
&& /home/jenkins/tools/java/latest/bin/javah -d
/home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-common-project/hadoop-common/target/native/javah
-classpath
/home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-common-project/hadoop-common/target/classes:/home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-common-project/hadoop-annotations/target/classes:/home/jenkins/tools/java/jdk1.6.0_26/jre/../lib/tools.jar:/home/jenkins/.m2/repository/com/google/guava/guava/11.0.2/guava-11.0.2.jar:/home/jenkins/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/home/jenkins/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/home/jenkins/.m2/repository/org/apache/commons/commons-math/2.1/commons-math-2.1.jar:/home/jenkins/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/home/jenkins/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar:/home/jenkins/.m2/repository/commons-codec/commons-codec/1.4/commons-codec-1.4.jar:/home/jenkins/.m2/repository/commons-io/commons-io/2.1/commons-io-2.1.jar:/home/jenkins/.m2/repository/commons-net/commons-net/3.1/commons-net-3.1.jar:/home/jenkins/.m2/repository/javax/servlet/servlet-api/2.5/servlet-api-2.5.jar:/home/jenkins/.m2/repository/org/mortbay/jetty/jetty/6.1.26/jetty-6.1.26.jar:/home/jenkins/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26/jetty-util-6.1.26.jar:/home/jenkins/.m2/repository/com/sun/jersey/jersey-core/1.9/jersey-core-1.9.jar:/home/jenkins/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar:/home/jenkins/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar:/home/jenkins/.m2/repository/stax/stax-api/1.0.1/stax-api-1.0.1.jar:/home/jenkins/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar:/home/jenkins/.m2/repository/javax/xml/bind/jaxb-api/2.2.2/jaxb-api-2.2.2.jar:/home/jenkins/.m2/repository/javax/activation/activation/1.1/activation-1.1.jar:/home/jenkins/.m2/repository/org/codehaus/jackson/jackson-jaxrs/1.8.8/jackson-jaxrs-1.8.8.jar:/home/jenkins/.m2/repository/org/codehaus/jackson/jackson-xc/1.8.8/jackson-xc-1.8.8.jar:/home/jenkins/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar:/home/jenkins/.m2/repository/asm/asm/3.2/asm-3.2.jar:/home/jenkins/.m2/repository/commons-logging/commons-logging/1.1.1/commons-logging-1.1.1.jar:/home/jenkins/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/home/jenkins/.m2/repository/net/java/dev/jets3t/jets3t/0.6.1/jets3t-0.6.1.jar:/home/jenkins/.m2/repository/commons-lang/commons-lang/2.5/commons-lang-2.5.jar:/home/jenkins/.m2/repository/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar:/home/jenkins/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/home/jenkins/.m2/repository/commons-digester/commons-digester/1.8/commons-digester-1.8.jar:/home/jenkins/.m2/repository/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar:/home/jenkins/.m2/repository/commons-beanutils/commons-beanutils-core/1.8.0/commons-beanutils-core-1.8.0.jar:/home/jenkins/.m2/repository/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar:/home/jenkins/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.8.8/jackson-core-asl-1.8.8.jar:/home/jenkins/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.8.8/jackson-mapper-asl-1.8.8.jar:/home/jenkins/.m2/repository/org/apache/avro/avro/1.7.4/avro-1.7.4.jar:/home/jenkins/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3/paranamer-2.3.jar:/home/jenkins/.m2/repository/org/xerial/snappy/snappy-java/1.0.4.1/snappy-java-1.0.4.1.jar:/home/jenkins/.m2/repository/com/google/protobuf/protobuf-java/2.5.0/protobuf-java-2.5.0.jar:/home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-common-project/hadoop-auth/target/classes:/home/jenkins/.m2/repository/com/jcraft/jsch/0.1.42/jsch-0.1.42.jar:/home/jenkins/.m2/repository/org/apache/zookeeper/zookeeper/3.4.5/zookeeper-3.4.5.jar:/home/jenkins/.m2/repository/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar:/home/jenkins/.m2/repository/org/tukaani/xz/1.0/xz-1.0.jar
org.apache.hadoop.io.compress.zlib.ZlibCompressor
org.apache.hadoop.io.compress.zlib.ZlibDecompressor
org.apache.hadoop.io.compress.bzip2.Bzip2Compressor
org.apache.hadoop.io.compress.bzip2.Bzip2Decompressor
org.apache.hadoop.security.JniBasedUnixGroupsMapping
org.apache.hadoop.io.nativeio.NativeIO
org.apache.hadoop.security.JniBasedUnixGroupsNetgroupMapping
org.apache.hadoop.io.compress.snappy.SnappyCompressor
org.apache.hadoop.io.compress.snappy.SnappyDecompressor
org.apache.hadoop.io.compress.lz4.Lz4Compressor
org.apache.hadoop.io.compress.lz4.Lz4Decompressor
org.apache.hadoop.util.NativeCrc32
org.apache.hadoop.net.unix.DomainSocket
Error occurred during initialization of VM
java.lang.OutOfMemoryError: unable to create new native thread
--
Alejandro