Hi,

Please investigate the logs and standard output/error of the task manager that failed (the logs you showed are from the job manager). There is probably an obvious error or exception there explaining why it failed. The most common reasons are:
- out of memory
- long GC pause
- segfault or other error from a native library
- task manager killed, for example via SIGKILL

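As a rough starting point for going through the task manager output, here is a minimal Python sketch that flags lines hinting at some of the causes above. The patterns and the `scan_log` helper are my own assumptions for illustration, not anything Flink provides, so adjust them to what your logs actually contain. A SIGKILL cannot be caught by the process, so it leaves no line in the process's own log, and GC pauses only show up if GC logging is enabled; neither is covered here.

```python
import re

# Hypothetical patterns (assumptions, not a Flink API); extend as needed.
PATTERNS = {
    # The JVM reports heap/metaspace exhaustion with this class name.
    "out_of_memory": re.compile(r"java\.lang\.OutOfMemoryError"),
    # Typical wording of a JVM crash report caused by native code.
    "native_crash": re.compile(
        r"SIGSEGV|A fatal error has been detected by the Java Runtime Environment"
    ),
    # Generic catch-all for logged errors and stack traces.
    "error_or_exception": re.compile(r"\b(?:ERROR|FATAL)\b|Exception"),
}


def scan_log(lines):
    """Return (line_number, label, text) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label, line.rstrip("\n")))
    return hits
```

To use it, open the task manager's log or stdout file, pass the file object to `scan_log`, and print the returned tuples; the line numbers tell you where to start reading.
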
Piotrek

> On 6 Dec 2018, at 17:34, Alieh <sae...@informatik.uni-leipzig.de> wrote:
>
> Hello all,
>
> I have an algorithm x() which contains several joins and uses Gelly
> ConnectedComponents three times. The problem is that if I call x() inside a
> script more than three times, I receive the messages listed below in the log
> and the program somehow stops. It happens even if I run it on a toy example
> of a graph with fewer than 10 vertices. Do you have any clue what the
> problem is?
>
> Cheers,
>
> Alieh
>
>
> 129149 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger heartbeat request.
> 129149 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger heartbeat request.
> 129150 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.taskexecutor.TaskExecutor - Received heartbeat request from e80ec35f3d0a04a68000ecbdc555f98b.
> 129150 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received heartbeat from 78cdd7a4-0c00-4912-992f-a2990a5d46db.
> 129151 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received new slot report from TaskManager 78cdd7a4-0c00-4912-992f-a2990a5d46db.
> 129151 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Received slot report from instance 4c3e3654c11b09fbbf8e993a08a4c2da.
> 129200 [flink-akka.actor.default-dispatcher-15] DEBUG org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Release TaskExecutor 4c3e3654c11b09fbbf8e993a08a4c2da because it exceeded the idle timeout.
> 129200 [flink-akka.actor.default-dispatcher-15] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Worker 78cdd7a4-0c00-4912-992f-a2990a5d46db could not be stopped.