So is the TaskManager JVM still running after the JM detected that the TM
has gone?
If not, can you check the kernel log (dmesg) to see whether the Linux OOM
killer stopped the process? (If it's a kill, the JVM might not be able to
log anything anymore.)
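A minimal sketch of how the dmesg check could be automated, assuming the kernel log has already been captured as text. The function name `find_oom_kills` and the two message patterns are illustrative assumptions; the exact wording of OOM-killer messages differs between kernel versions, so the patterns may need adjusting.

```python
import re

# Patterns for kernel OOM-killer messages. The wording varies between kernel
# versions, so these are assumptions and not an exhaustive list.
OOM_PATTERNS = [
    re.compile(r"Out of memory: Kill(?:ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
    re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
]


def find_oom_kills(kernel_log_text):
    """Return the distinct (pid, process_name) pairs that the log reports as OOM-killed."""
    kills = []
    for line in kernel_log_text.splitlines():
        for pattern in OOM_PATTERNS:
            match = pattern.search(line)
            if match:
                entry = (int(match.group("pid")), match.group("name"))
                # The kernel often logs the same kill on two lines; keep it once.
                if entry not in kills:
                    kills.append(entry)
                break
    return kills
```

The text to scan could come from running `dmesg` on the affected machine and passing its output to the function; a `java` entry in the result would suggest the TaskManager JVM was killed by the kernel rather than failing on its own.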
On Thu, Oct 29, 2015 at 9:27 PM, Stephan Ewen w
Thanks for sharing the logs, Greg!
Okay, so the TaskManager does not crash, but the Remote Failure Detector of
Akka marks the connection between JobManager and TaskManager as broken.
The TaskManager is not doing much GC, so it is not a long JVM freeze that
causes heartbeats to time out...
I am wo
Could it be a problem that there are two TaskManagers running per machine?
> On 29 Oct 2015, at 19:04, Greg Hogan wrote:
>
> I have memory logging enabled. Tail of TaskManager log on 10.0.88.140:
>
> 17:35:26,415 INFO
> org.apache.flink.runtime.taskmanager.TaskManager - Garbage
> c
I have memory logging enabled. Tail of TaskManager log on 10.0.88.140:
17:35:26,415 INFO
org.apache.flink.runtime.taskmanager.TaskManager - Garbage
collector stats: [PS Scavenge, GC TIME (ms): 341, GC COUNT: 3], [PS
MarkSweep, GC TIME (ms): 974, GC COUNT: 1]
17:35:27,415 INFO
org.apac
What does the log of the failed TaskManager 10.0.88.140 say?
On Thu, Oct 29, 2015 at 6:44 PM, Greg Hogan wrote:
> I removed the use of numactl but left in starting two TaskManagers and am
> still seeing TaskManagers crash.
> From the JobManager log:
>
> 17:36:06,412 WARN
> akka.remote.ReliableDe
Forwarding these here to keep dev@ in the loop :)
-- Forwarded message --
From: Martin Junghanns
Date: 29 October 2015 at 18:37
Subject: Re: neo4j - Flink connector
To: Martin Liesenberg , Vasia Kalavri <
vasilikikala...@gmail.com>
Cc: Alexander Keller , Martin Neumann
My idea
I removed the use of numactl but left in starting two TaskManagers and am
still seeing TaskManagers crash.
From the JobManager log:
17:36:06,412 WARN
akka.remote.ReliableDeliverySupervisor- Association
with remote system [akka.tcp://flink@10.0.88.140:45742] has failed, add
Fabian Hueske created FLINK-2943:
Summary: Confusing Bytes/Records "read" and "write" labels in
WebUI job view
Key: FLINK-2943
URL: https://issues.apache.org/jira/browse/FLINK-2943
Project: Flink
Hi Greg!
Interesting... When you say the TaskManagers are dropping, are the
TaskManager processes crashing, or are they losing connection to the
JobManager?
Greetings,
Stephan
On Thu, Oct 29, 2015 at 9:56 AM, Greg Hogan wrote:
> I recently discovered that AWS uses NUMA for its largest nodes.
I recently discovered that AWS uses NUMA for its largest nodes. An example
c4.8xlarge:
$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 18 19 20 21 22 23 24 25 26
node 0 size: 29813 MB
node 0 free: 24537 MB
node 1 cpus: 9 10 11 12 13 14 15 16 17 27 28 29 30 31 32 33 34
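A small sketch of how the node-to-CPU mapping in `numactl --hardware` output like the above could be extracted, for example when deciding which CPUs belong to each NUMA node before pinning processes. The function name `parse_numa_cpus` is hypothetical, and the parser assumes the `node N cpus: ...` line format shown in this message.

```python
import re


def parse_numa_cpus(numactl_output):
    """Map each NUMA node id to its list of CPU ids, read from `numactl --hardware` text."""
    nodes = {}
    for line in numactl_output.splitlines():
        # Only the "node N cpus: ..." lines carry the CPU assignment.
        match = re.match(r"\s*node (\d+) cpus:\s*(.*)$", line)
        if match:
            nodes[int(match.group(1))] = [int(cpu) for cpu in match.group(2).split()]
    return nodes
```

Lines such as `node 0 size` and `node 0 free` are ignored, so the full command output can be passed in unchanged.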
Fabian Hueske created FLINK-2942:
Summary: Dangling operators in web UI's program visualization
Key: FLINK-2942
URL: https://issues.apache.org/jira/browse/FLINK-2942
Project: Flink
Issue Type
Hi Matthias,
There is currently no cancel button in the web frontend. Just filed this
ticket today: https://issues.apache.org/jira/browse/FLINK-2939
Cheers,
Max
On Thu, Oct 29, 2015 at 4:49 PM, Matthias J. Sax wrote:
> Hi,
>
> I was just playing with the new JobManager web frontend and missing
Hi,
I was just playing with the new JobManager web frontend and missing a
button to cancel a running job. Is there no such button, or is it hidden
somewhere?
-Matthias
Hello everyone,
Martin, Martin, Alex (cc'ed) and I have started discussing
implementing a neo4j-Flink connector. I've opened a corresponding JIRA
(FLINK-2941) containing an initial document [1], but we'd also like to
share our ideas here to engage the community and get your feedback.
W
Thanks Max ^^
On Wed, Oct 28, 2015 at 8:41 PM, Maximilian Michels wrote:
> Oups, forgot the mapper :)
>
> static class StatefulMapper extends RichMapFunction Long>, Tuple2> {
>
>private OperatorState counter;
>
>@Override
>public Tuple2 map(Tuple2 value) throws
> Exception {
>
Hi Greg,
Thanks for reporting. You wrote you didn't see any output in the .out files
of the task managers. What about the .log files of these instances?
Where and when did you produce the thread dump you included?
Thanks,
Max
On Thu, Oct 29, 2015 at 1:46 PM, Greg Hogan wrote:
> I am testing a
I am testing again on a 64 node cluster (the JobManager is running fine
having reduced some operators' parallelism and fixed the string conversion
performance).
I am seeing TaskManagers drop like flies every other job or so. I am not
seeing any output in the .out log files corresponding to the cra
Vasia Kalavri created FLINK-2941:
Summary: Implement a neo4j - Flink/Gelly connector
Key: FLINK-2941
URL: https://issues.apache.org/jira/browse/FLINK-2941
Project: Flink
Issue Type: New Featu
Seems like we agree that we need artifacts for different versions of Scala
on Maven. There also seems to be a preference for including the version in
the artifact name.
I've created an issue and marked it to be resolved for 1.0. For the 0.10
release, we will have binaries but no Maven artifacts. T
Maximilian Michels created FLINK-2940:
Summary: Deploy multiple Scala versions for Maven artifacts
Key: FLINK-2940
URL: https://issues.apache.org/jira/browse/FLINK-2940
Project: Flink
Is
Maximilian Michels created FLINK-2939:
Summary: Add button to cancel jobs in new web frontend
Key: FLINK-2939
URL: https://issues.apache.org/jira/browse/FLINK-2939
Project: Flink
Issue T
Maximilian Michels created FLINK-2938:
Summary: Streaming docs not in sync with latest state changes
Key: FLINK-2938
URL: https://issues.apache.org/jira/browse/FLINK-2938
Project: Flink
Theodore Vasiloudis created FLINK-2937:
Summary: Typo in Quickstart->Scala API->Alternative Build Tools:
SBT
Key: FLINK-2937
URL: https://issues.apache.org/jira/browse/FLINK-2937
Project: Flink