Piotr Nowojski created FLINK-8443:
-------------------------------------

             Summary: YARNSessionCapacitySchedulerITCase is flaky
                 Key: FLINK-8443
                 URL: https://issues.apache.org/jira/browse/FLINK-8443
             Project: Flink
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.5.0
            Reporter: Piotr Nowojski
         Attachments: 35.5.tar.gz

Build logs from Travis are attached. The test(s) fail with:

{noformat}
java.lang.AssertionError: Found a file /home/travis/build/dataArtisans/flink/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-logDir-nm-1_0/application_1516120275777_0003/container_1516120275777_0003_01_000002/taskmanager.log with a prohibited string (one of [Exception, Started SelectChannelConnector@0.0.0.0:8081]). Excerpts
{noformat}

The YARN logs uploaded to transfer.sh contain the following failure:

{code:java}
2018-01-16 16:32:10,553 INFO org.apache.flink.yarn.YarnTaskManager - Stopping TaskManager with final application status SUCCEEDED and diagnostics: Flink YARN Client requested shutdown
2018-01-16 16:32:10,577 INFO org.apache.flink.yarn.YarnTaskManager - Stopping TaskManager akka://flink/user/taskmanager#2122015748.
2018-01-16 16:32:10,578 INFO org.apache.flink.yarn.YarnTaskManager - Disassociating from JobManager
2018-01-16 16:32:10,588 INFO org.apache.flink.runtime.blob.PermanentBlobCache - Shutting down BLOB cache
2018-01-16 16:32:10,599 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
2018-01-16 16:32:10,614 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager - I/O manager removed spill file directory /home/travis/build/dataArtisans/flink/flink-yarn-tests/target/flink-yarn-tests-capacityscheduler/flink-yarn-tests-capacityscheduler-localDir-nm-1_0/usercache/travis/appcache/application_1516120275777_0003/flink-io-356a7c21-a3cd-43cb-926c-7690f861b66c
2018-01-16 16:32:10,615 INFO org.apache.flink.runtime.io.network.NetworkEnvironment - Shutting down the network environment and its components.
2018-01-16 16:32:10,619 INFO org.apache.flink.runtime.io.network.netty.NettyClient - Successful shutdown (took 4 ms).
2018-01-16 16:32:10,623 INFO org.apache.flink.runtime.io.network.netty.NettyServer - Successful shutdown (took 4 ms).
2018-01-16 16:32:10,641 INFO org.apache.flink.yarn.YarnTaskManager - Task manager akka://flink/user/taskmanager is completely shut down.
2018-01-16 16:32:10,649 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2018-01-16 16:32:10,650 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2018-01-16 16:32:10,717 WARN org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline - An exception was thrown by an exception handler.
java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)
	at org.apache.flink.shaded.akka.org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:784)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.SimpleChannelHandler.disconnectRequested(SimpleChannelHandler.java:320)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.SimpleChannelHandler.handleDownstream(SimpleChannelHandler.java:274)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.Channels.disconnect(Channels.java:781)
	at org.apache.flink.shaded.akka.org.jboss.netty.channel.AbstractChannel.disconnect(AbstractChannel.java:219)
	at akka.remote.transport.netty.NettyTransport$$anonfun$gracefulClose$1.apply(NettyTransport.scala:241)
	at akka.remote.transport.netty.NettyTransport$$anonfun$gracefulClose$1.apply(NettyTransport.scala:240)
	at scala.util.Success.foreach(Try.scala:236)
	at scala.concurrent.Future$$anonfun$foreach$1.apply(Future.scala:206)
	at scala.concurrent.Future$$anonfun$foreach$1.apply(Future.scala:206)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
	at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
	at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2018-01-16 16:32:10,755 INFO org.apache.flink.yarn.YarnTaskManagerRunner - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.}}
2018-01-16 16:32:10,762 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
2018-01-16 16:32:10,794 INFO org.apache.flink.yarn.YarnTaskManager - Shutdown completed. Stopping JVM.
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
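
For readers unfamiliar with this kind of assertion, here is a minimal sketch of a scan that would flag the taskmanager.log excerpt in this report. It is not the actual Flink test code: the class name `ProhibitedStringScan`, the method `findOffendingLines`, and the case-sensitive substring match are all assumptions made for illustration. Only the prohibited strings and the log lines come from the report. Under those assumptions, the WARN line itself (lowercase "exception") would not match, but the `RejectedExecutionException` line printed during shutdown would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the check used by the Flink YARN tests.
public class ProhibitedStringScan {

    /** Returns every log line that contains at least one prohibited substring (case-sensitive). */
    public static List<String> findOffendingLines(List<String> logLines, List<String> prohibited) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            for (String needle : prohibited) {
                if (line.contains(needle)) {
                    hits.add(line);
                    break; // report each offending line once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prohibited strings as listed in the assertion message of this report.
        List<String> prohibited = Arrays.asList(
                "Exception",
                "Started SelectChannelConnector@0.0.0.0:8081");

        // A few lines copied from the taskmanager.log excerpt in this report.
        List<String> log = Arrays.asList(
                "2018-01-16 16:32:10,650 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.",
                "2018-01-16 16:32:10,717 WARN org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline - An exception was thrown by an exception handler.",
                "java.util.concurrent.RejectedExecutionException: Worker has already been shutdown",
                "2018-01-16 16:32:10,794 INFO org.apache.flink.yarn.YarnTaskManager - Shutdown completed. Stopping JVM.");

        for (String hit : findOffendingLines(log, prohibited)) {
            System.out.println("prohibited string found in: " + hit);
        }
    }
}
```

If the real check works along these lines, the failure would depend on whether the shaded Netty worker has already shut down when Akka closes its remote transports, which would be consistent with the test failing only intermittently.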