Hello Piotrek,

Thank you for your answer. I installed Flink on a local cluster and used the GUI to monitor the task managers. It seems the program *does not start at all*; the whole time, only the job manager is struggling. For very small toy examples, the job does start after a long wait (during which I see the job manager logs I mentioned before), and it then executes in 2 seconds.

Best,

Alieh


On 12/07/2018 10:43 AM, Piotr Nowojski wrote:
Hi,

Please investigate the logs/standard output/error from the task manager that has 
failed (the logs that you showed are from the job manager). There is probably some 
obvious error/exception explaining why it has failed. The most common reasons:
- out of memory
- long GC pause
- seg fault or other error from some native library
- task manager killed via, for example, SIGKILL
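
As a rough aid for checking a task manager log against the reasons above, here is a minimal sketch in Python. The function names and the pattern list are illustrative assumptions, not an official catalogue of Flink messages; `java.lang.OutOfMemoryError` and `SIGSEGV` are the usual JVM signatures for the first and third items. Long GC pauses and a SIGKILL usually leave no line in the task manager log itself, so those two would have to be checked in the GC log (if enabled) and in the operating system logs instead.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive list of Flink messages).
PATTERNS = {
    "out of memory": re.compile(r"java\.lang\.OutOfMemoryError"),
    "native crash": re.compile(
        r"SIGSEGV|A fatal error has been detected by the Java Runtime Environment"
    ),
    "generic exception": re.compile(r"\bERROR\b|Exception"),
}


def scan_log(lines):
    """Return {cause: [(line_number, text), ...]} for every pattern that matched."""
    hits = {}
    for number, line in enumerate(lines, start=1):
        for cause, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(cause, []).append((number, line.rstrip()))
    return hits


def scan_file(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return scan_log(handle)
```

Calling `scan_file` on the task manager's log file (its location depends on the installation) returns the matching lines grouped by suspected cause; an empty result suggests looking at the standard output/error files and the GC log next.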

Piotrek

On 6 Dec 2018, at 17:34, Alieh <sae...@informatik.uni-leipzig.de> wrote:

Hello all,

I have an algorithm x() which contains several joins and uses Gelly 
ConnectedComponents three times. The problem is that if I call x() inside a script 
more than three times, I receive the messages listed below in the log and the 
program somehow stops. It happens even if I run it with a toy example of a 
graph with fewer than 10 vertices. Do you have any clue what the problem is?

Cheers,

Alieh


129149 [flink-akka.actor.default-dispatcher-20] DEBUG 
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger 
heartbeat request.
129149 [flink-akka.actor.default-dispatcher-20] DEBUG 
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger 
heartbeat request.
129150 [flink-akka.actor.default-dispatcher-20] DEBUG 
org.apache.flink.runtime.taskexecutor.TaskExecutor  - Received heartbeat 
request from e80ec35f3d0a04a68000ecbdc555f98b.
129150 [flink-akka.actor.default-dispatcher-22] DEBUG 
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received 
heartbeat from 78cdd7a4-0c00-4912-992f-a2990a5d46db.
129151 [flink-akka.actor.default-dispatcher-22] DEBUG 
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received 
new slot report from TaskManager 78cdd7a4-0c00-4912-992f-a2990a5d46db.
129151 [flink-akka.actor.default-dispatcher-22] DEBUG 
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Received 
slot report from instance 4c3e3654c11b09fbbf8e993a08a4c2da.
129200 [flink-akka.actor.default-dispatcher-15] DEBUG 
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Release 
TaskExecutor 4c3e3654c11b09fbbf8e993a08a4c2da because it exceeded the idle 
timeout.
129200 [flink-akka.actor.default-dispatcher-15] DEBUG 
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Worker 
78cdd7a4-0c00-4912-992f-a2990a5d46db could not be stopped.


