Re: Streaming job failure due to loss of Taskmanagers

2016-04-01 Thread Maximilian Michels
Hi Ravinder, it would be interesting to see the log output of the disconnected TaskManager. Possibly, the TaskManager ran out of memory because the state of your program grew too large. You can work around this by using a different state backend like RocksDB. Cheers, Max On Wed, Mar 23, 2016 at 1

Re: Streaming job failure due to loss of Taskmanagers

2016-03-21 Thread Ufuk Celebi
Hey Ravinder, can you please share the JobManager logs as well? The logs say that the TaskManager disconnects from the JobManager because the JobManager is no longer reachable. At this point, the running shuffles are cancelled and you see the follow-up RemoteTransportExceptions. – Ufuk On Mon, Ma

Streaming job failure due to loss of Taskmanagers

2016-03-21 Thread Ravinder Kaur
Hello all, I'm running the WordCount example streaming job and it fails because TaskManagers are lost. When I went through the TaskManager logs, I found the following messages: 15:14:26,592 INFO org.apache.flink.streaming.runtime.tasks.StreamTask - State backend is set to heap memory