Stefan Bunk created FLINK-1830:
----------------------------------

             Summary: java.io.IOException: Network stream corrupted
                 Key: FLINK-1830
                 URL: https://issues.apache.org/jira/browse/FLINK-1830
             Project: Flink
          Issue Type: Bug
    Affects Versions: 0.8.1
            Reporter: Stefan Bunk

When running my Flink job I get the following error:

{quote}
04.Apr. 20:43:12 WARN  DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Network stream corrupted: invalid magicnumber in current envelope header.
	at org.apache.flink.runtime.io.network.netty.InboundEnvelopeDecoder.decodeEnvelope(InboundEnvelopeDecoder.java:239)
	at org.apache.flink.runtime.io.network.netty.InboundEnvelopeDecoder.decodeBuffer(InboundEnvelopeDecoder.java:127)
	at org.apache.flink.runtime.io.network.netty.InboundEnvelopeDecoder.channelRead(InboundEnvelopeDecoder.java:111)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:125)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
	at java.lang.Thread.run(Thread.java:745)
{quote}

Sometimes the job works, sometimes it fails with the above error.
When it fails, the job still appears as running, but nothing happens anymore until I cancel it manually. In the logs I can then find the error, often repeated hundreds of times.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)