Please find the logs here:
http://www2.informatik.hu-berlin.de/~saxmatti/flink-mjsax-jobmanager-0-dbis21.log
http://www2.informatik.hu-berlin.de/~saxmatti/flink-mjsax-taskmanager-0-dbis34.log
-Matthias
On 10/15/2015 12:16 PM, Stephan Ewen wrote:
Blocking actor calls should not be an issue (even if they are there),
because the heartbeats go between the actor systems, rather than the
actors...
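For reference, the death-watch heartbeats referred to here are tunable from flink-conf.yaml. A hedged sketch, with key names as documented in that era's configuration reference and purely illustrative values (raising the pause makes the failure detector more lenient):

  akka.watch.heartbeat.interval: 10 s
  akka.watch.heartbeat.pause: 60 s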
On Thu, Oct 15, 2015 at 12:14 PM, Till Rohrmann wrote:
And please set akka.log.lifecycle.events: true so that Akka also logs its
lifecycle events.
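That is a one-line addition to flink-conf.yaml (the comment is mine):

  # log Akka remoting lifecycle events (associations, disassociations, ...)
  akka.log.lifecycle.events: true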
On Thu, Oct 15, 2015 at 12:12 PM, Robert Metzger wrote:
Can you start Flink with logging level DEBUG?
Then we can see from the TaskManager logs when the TM became inactive.
Maybe an Akka message is causing the actor to block?
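A minimal sketch of how to switch to DEBUG, assuming the default log4j setup shipped in Flink's conf/ directory:

  # conf/log4j.properties -- raise the root logger from INFO to DEBUG
  log4j.rootLogger=DEBUG, file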
You can also monitor the GC from the TaskManager view in the web interface
(for example by looking at the total time spent on garbage collection).
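As an alternative to the web interface, the JVM itself can log GC activity. A hedged sketch via flink-conf.yaml (env.java.opts is the standard key for passing JVM options; the flags are ordinary HotSpot GC-logging options):

  env.java.opts: -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/flink-gc.log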
Does not quite sound like GC is an issue.
Hmmm, what else can make the failure detector kick in unexpectedly?
On Thu, Oct 15, 2015 at 12:05 PM, Till Rohrmann wrote:
To verify whether GC is a problem you can enable logging of the JVM's memory
usage via taskmanager.debug.memory.startLogThread: true. The interval of
the logging is configured via taskmanager.debug.memory.logIntervalMs.
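Put together in flink-conf.yaml (both keys as given above; the 5000 ms interval is only an example value):

  taskmanager.debug.memory.startLogThread: true
  taskmanager.debug.memory.logIntervalMs: 5000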
On Thu, Oct 15, 2015 at 12:00 PM, Matthias J. Sax wrote:
The problem is reproducible (it happens on each run).
I doubt that GC is an issue here (at least from a UDF point of view),
because I read the file once and keep a String object for each line.
These objects are kept until the very end; the UDF does not release them
until it returns from its "run()" method.
From what the logs show, the TaskManager does not send pings any more for a
long time and is then considered failed, and the tasks running on that
TaskManager are considered failed as well. So far, nothing unusual...
Question is, why is it considered failed? Is this a reproducible problem?
Or a one-time thing?
One thing I forgot to add: I also have a Storm-WordCount job (built via
FlinkTopologyBuilder) that uses the same
"buffer-file-and-emit-over-and-over-again" pattern in a spout. This job
runs just fine and stops regularly after 5 minutes.
-Matthias
On 10/14/2015 10:42 PM, Matthias J. Sax wrote:
No. See log below.
Btw: the job is not cleaned up properly. Some tasks remain in state
"Canceling".
The program I execute is the "Streaming WordCount" example with my own
source function. This custom source (see below) reads a local (small)
file, buffers each line in an internal buffer, and emits the buffered
lines over and over again.
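A minimal sketch of such a source, against the SourceFunction API of that era (class name and details here are illustrative, not the actual code):

  import java.nio.charset.StandardCharsets;
  import java.nio.file.Files;
  import java.nio.file.Paths;
  import java.util.List;

  import org.apache.flink.streaming.api.functions.source.SourceFunction;

  // Reads a small local file once, keeps every line as a String in an
  // in-memory buffer, and emits the buffered lines over and over again
  // until the source is cancelled. The Strings stay referenced until
  // run() returns, matching the memory behavior described above.
  public class RepeatingFileSource implements SourceFunction<String> {

      private final String path;
      private volatile boolean running = true;

      public RepeatingFileSource(String path) {
          this.path = path;
      }

      @Override
      public void run(SourceContext<String> ctx) throws Exception {
          // buffered once up front; released only when run() returns
          List<String> buffer =
              Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8);
          while (running) {
              for (String line : buffer) {
                  ctx.collect(line);
              }
          }
      }

      @Override
      public void cancel() {
          running = false;
      }
  }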
On 11 Oct 2015, at 23:54, Stephan Ewen wrote:
Ping. :)
Can you see if there is anything unusual in the JobManager logs?
On 11.10.2015 18:56, "Matthias J. Sax" wrote:
Hi,
I was just playing around with Flink. After submitting my job, it runs
for multiple minutes, until I get the following exception in one of the
TaskManager logs and the job fails.
I have no clue what's going on...
-Matthias
> 18:43:23,567 WARN akka.remote.RemoteWatcher