So regarding the release, you're saying you don't feel very confident that we will manage to fork off release-0.9 next week?
The exception in the jobmanager-stderr from the YARN tests is the following (from #347.5 and #344.5):

07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: Mar 12, 2015 7:45:57 AM org.jboss.netty.channel.DefaultChannelPipeline
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: WARNING: An exception was thrown by an exception handler.
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.Channels.disconnect(Channels.java:781)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at org.jboss.netty.channel.AbstractChannel.disconnect(AbstractChannel.java:211)
07:46:00,598 WARN org.apache.flink.yarn.YarnTestBase - LINE: at akka.remote.transport.netty.NettyTransport$$anonfun$gracefulClose$1.apply(NettyTransport.scala:223)

On Thu, Mar 12, 2015 at 9:51 AM, Ufuk Celebi <u...@apache.org> wrote:

> On Tue, Mar 10, 2015 at 11:20 AM, Robert Metzger <rmetz...@apache.org>
> wrote:
>
> > I think
> > it is time to evaluate whether we are confident that the master is stable.
> >
>
> In the course of finishing up #471 [1] I ran 20 Travis builds over night,
> of which 7 failed.
>
> The (unexpected) failing test cases:
>
> - ExternalSortITCase.testSpillingSortWithIntermediateMerge:325 Field 0 is
>   null, but expected to hold a key.
> - JobManagerProcessReapingTest.testReapProcessOnFailure:121 JobManager
>   process did not launch the JobManager properly. Failed to look up
>
> The (expected/known-to-fail) failing test cases:
>
> - TaskManagerFailsITCase => will be fixed with Shading?
> - YARN test cases => polluted logs (unrelated to YARN)?
>
> Can people, who are familiar with the test cases confirm/explain that the
> failures are known. Details about failing builds below.
>
> (One of the failures is related to the changes in my PR.)
>
> [1] https://github.com/apache/flink/pull/471
>
> ----
>
> #327: https://travis-ci.org/uce/incubator-flink/builds/53985832
> - 327.1 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53985834/log.txt):
>   ExternalSortITCase.testSpillingSortWithIntermediateMerge:325 Field 0 is
>   null, but expected to hold a key.
> - 327.4 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53985838/log.txt):
>   YARNSessionFIFOITCase => exception in taskmanager-strerr.log file
>
> #331: https://travis-ci.org/uce/incubator-flink/builds/53985889
> - 331.2 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53985892/log.txt):
>   Failed due to a change in my PR
> - 332.3 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53985893/log.txt):
>   TaskManagerFailsITCase => expected class
>   org.apache.flink.runtime.messages.JobManagerMessages$JobResultSuccess,
>   found class akka.actor.Status$Failure
>
> #332: https://travis-ci.org/uce/incubator-flink/builds/53985900
> - 332.3 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53985903/log.txt):
>   TaskManagerFailsITCase => expected class
>   org.apache.flink.runtime.messages.JobManagerMessages$JobResultSuccess,
>   found class akka.actor.
>
> #338: https://travis-ci.org/uce/incubator-flink/builds/53985981
> - 338.5 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53985986/log.txt):
>   Failed due to a change in my PR
>
> #344. https://travis-ci.org/uce/incubator-flink/builds/53986054
> - 344.5 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53986059/log.txt):
>   YARNSessionFIFOITCase => exception in taskmanager-strerr.log file
>
> #346. https://travis-ci.org/uce/incubator-flink/builds/53986071
> - 346.3 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53986080/log.txt):
>   JobManagerProcessReapingTest.testReapProcessOnFailure:121 JobManager
>   process did not launch the JobManager properly. Failed to look up
>   JobManager actor at localhost:57964
>
> #347. https://travis-ci.org/uce/incubator-flink/builds/53986111
> - 347.5 (https://s3.amazonaws.com/archive.travis-ci.org/jobs/53986116/log.txt):
>   YARNSessionCapacitySchedulerITCase => exception in jobmanager-strerr.log
>   file
>