[ https://issues.apache.org/jira/browse/FLINK-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Metzger updated FLINK-2065:
----------------------------------
    Description:
While running some experiments, I've noticed that jobs sometimes finish in FAILED, even though I've cancelled them.

The reported error is
{code}
hdp22-kafka-w-0.c.astral-sorter-757.internal
Error: java.lang.IllegalStateException: Buffer has already been recycled.
    at org.apache.flink.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:173)
    at org.apache.flink.runtime.io.network.buffer.Buffer.ensureNotRecycled(Buffer.java:142)
    at org.apache.flink.runtime.io.network.buffer.Buffer.getMemorySegment(Buffer.java:78)
    at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.setNextBuffer(SpillingAdaptiveSpanningRecordDeserializer.java:72)
    at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:80)
    at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34)
    at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
    at org.apache.flink.runtime.operators.MapDriver.run(MapDriver.java:96)
    at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
    at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
    at java.lang.Thread.run(Thread.java:745)
{code}

The logs:
{code}
16:29:37,212 INFO org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Trying to cancel job with ID ecccf02327c70c9e35770c6da37638e1.
16:29:37,214 INFO org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Status of job ecccf02327c70c9e35770c6da37638e1 (Simple big union) changed to CANCELLING .
16:31:15,581 INFO org.apache.flink.yarn.ApplicationMaster$$anonfun$2$$anon$1 - Status of job ecccf02327c70c9e35770c6da37638e1 (Simple big union) changed to FAILING Buffer has already been recycled..
{code}

  was:
While running some experiments, I've noticed that jobs sometimes finish in FAILED, even though I've cancelled them.

The

> Cancelled jobs finish with final state FAILED
> ---------------------------------------------
>
>                 Key: FLINK-2065
>                 URL: https://issues.apache.org/jira/browse/FLINK-2065
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Runtime
>    Affects Versions: 0.9
>            Reporter: Robert Metzger