Andrey created FLINK-6278:
-----------------------------

             Summary: After "FATAL" message in logs taskmanager still working
                 Key: FLINK-6278
                 URL: https://issues.apache.org/jira/browse/FLINK-6278
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Andrey

Steps to reproduce:
* Create a job that causes an OOM. For example, for each incoming message, allocate a byte array and keep a reference to it in memory:
{code}
}).flatMap(new RichFlatMapFunction<String, AggregatedHash>() {
    // Grows without bound; nothing ever removes entries from this list.
    private List<byte[]> oomList = new ArrayList<>();

    @Override
    public void flatMap(String value, Collector<AggregatedHash> out) throws Exception {
        if (oomHost != null && oomHost.equals(host)) {
            // multiply speed towards oom
            oomList.add(new byte[100 * 1024]);
        }
    }
}
{code}
* After some time the task manager hangs.
* According to the logs, the task manager gets disconnected from ZooKeeper (HA mode was configured) and from the job manager.
* Then a large log entry appears:
{code}
2017-04-07 09:13:32,893 ERROR org.apache.flink.runtime.taskmanager.TaskManager -
==============================================================
====================== FATAL =======================
==============================================================

A fatal error occurred, forcing the TaskManager to shut down: Task 'Flat Map -> Sink: Unnamed (2/2)' did not react to cancelling signal in the last 30 seconds, but is stuck in method:
 org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:182)
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:272)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
java.lang.Thread.run(Thread.java:745)
{code}
* The TM is still running after this message. Thread dump attached.

Expected:
* The task manager shuts down.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
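
To make the expected behaviour above concrete, here is a minimal standalone sketch of a cancellation watchdog that escalates when a task ignores the cancel signal. It is not Flink code and not a proposed patch: the class `CancellationWatchdog`, the method `cancelOrEscalate`, and the timeouts are hypothetical names and values chosen for illustration. The only JDK fact it relies on is that `Runtime.halt` terminates the JVM without running shutdown hooks, whereas `System.exit` runs them, so `halt` cannot be blocked by a hook that is itself stuck.

```java
public class CancellationWatchdog {

    /**
     * Interrupts the worker, waits up to timeoutMillis for it to finish, and
     * runs onFatal if it is still alive afterwards.
     * timeoutMillis must be greater than 0, because Thread.join(0) waits forever.
     *
     * @return true if the worker ignored the cancel signal and onFatal was run
     */
    static boolean cancelOrEscalate(Thread worker, long timeoutMillis, Runnable onFatal)
            throws InterruptedException {
        worker.interrupt();          // the "cancelling signal"
        worker.join(timeoutMillis);  // give the task a bounded time to react
        if (worker.isAlive()) {
            onFatal.run();           // task is stuck: escalate
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a stuck task: a loop that never checks the interrupt flag.
        Thread stuck = new Thread(() -> {
            long end = System.nanoTime() + 2_000_000_000L;
            while (System.nanoTime() < end) {
                // busy loop, ignores interruption
            }
        });
        stuck.setDaemon(true);
        stuck.start();

        // This demo handler only logs. A real process would end the handler with
        // Runtime.getRuntime().halt(-1) so that the JVM terminates even if
        // shutdown hooks or other threads are blocked.
        boolean escalated = cancelOrEscalate(stuck, 200, () ->
                System.err.println("FATAL: task did not react to cancelling signal"));
        System.out.println("escalated = " + escalated);
    }
}
```

The handler is passed in as a `Runnable` so the escalation path can be exercised without killing the JVM. In the scenario reported here, the interesting question is what the handler does after logging the FATAL banner: if it relies on a graceful shutdown path, that path may itself hang in a process that is already out of memory.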