Hi Gary,
Thanks for the answer. I had missed your most recent reply in this thread, too.
However, my last question
Averell wrote
> How about changing the configuration of the Flink job itself during
> runtime?
> What I have to do now is to take a savepoint, stop the job, change the
> configuration,
Hi Averell,
The TM containers fetch the Flink binaries and config files from HDFS (or
another DFS if configured) [1]. I think you should be able to change the log
level by patching the logback configuration in HDFS and then killing all Flink
containers on all hosts. If you are running an HA setup, your c
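A hedged sketch of what that could look like. The staging path and application
ID are hypothetical and vary by setup; `YarnTaskExecutorRunner` is the TM
entry-point class in Flink 1.7-era YARN deployments:

```shell
# Hypothetical: overwrite the logback config in the YARN staging directory
# so that restarted containers pick up the new log level
hdfs dfs -put -f conf/logback.xml \
  /user/hadoop/.flink/application_1549000000000_0001/logback.xml

# Then kill the Flink containers on each host so YARN restarts them.
# Destructive; relies on an HA setup to recover the job, e.g.:
# ssh <host> 'pkill -f YarnTaskExecutorRunner'
```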
Hi Gary,
Thanks for the suggestion.
How about changing the configuration of the Flink job itself during runtime?
What I have to do now is to take a savepoint, stop the job, change the
configuration, and then restore the job from the savepoint.
Is there any easier way to do that?
Thanks and r
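For reference, the savepoint/stop/restore cycle described above can be
sketched with the Flink 1.7-era CLI; the job ID, savepoint paths, and jar
name are placeholders:

```shell
# Take a savepoint and cancel the job in one step
flink cancel -s hdfs:///flink/savepoints <jobId>

# ... change the job configuration ...

# Resubmit the job (detached), restoring state from the savepoint
flink run -d -s hdfs:///flink/savepoints/savepoint-<jobId>-<suffix> my-job.jar
```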
Hi Averell,
Logback has this feature [1], but it is not enabled out of the box. You will
have to enable the JMX agent by setting the com.sun.management.jmxremote system
property [2][3]. I have not tried this out, though.
Best,
Gary
[1] https://logback.qos.ch/manual/jmxConfig.html
[2]
https://docs.or
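For context, logback's JMX support is switched on with a `<jmxConfigurator/>`
element in logback.xml; a minimal sketch (the appender and pattern are
illustrative only):

```xml
<!-- <jmxConfigurator/> registers a JMXConfigurator MBean exposing
     operations such as setLoggerLevel(name, level), which can be
     invoked at runtime, e.g. from jconsole -->
<configuration>
  <jmxConfigurator/>
  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="CONSOLE"/>
  </root>
</configuration>
```

The JVM would additionally need the remote JMX properties Gary mentions, e.g.
`-Dcom.sun.management.jmxremote.port=9999` plus authentication/SSL settings
appropriate for the environment.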
Hi Gary,
I am trying to reproduce that problem.
BTW, is it possible to change the log level (I'm using logback) for a running
job?
Thanks and regards,
Averell
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Averell,
That log file does not look complete. I do not see any INFO level log
messages
such as [1].
Best,
Gary
[1]
https://github.com/apache/flink/blob/46326ab9181acec53d1e9e7ec8f4a26c672fec31/flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java#L544
On Fri, Feb 1, 2019 a
Hi Gary,
I faced a similar problem yesterday, but I don't know the cause yet.
The situation I observed is as follows:
- At about 2:57, one of my EMR execution nodes (IP ...99) got disconnected
from the YARN resource manager (on the RM I could not see that node anymore),
despite that the node wa
Hi Gary,
Thanks for the help.
Gary Yao-3 wrote
> You are writing that it takes YARN 10 minutes to restart the application
> master (AM). However, in my experiments the AM container is restarted
> within a few seconds after killing the process. If in your setup YARN
> actually needs 10 minu
Hi Averell,
> Is there any way to avoid this? (If I run this as an AWS EMR job, the job
> would be considered failed, while it is actually restored automatically by
> YARN after 10 minutes.)
You are writing that it takes YARN 10 minutes to restart the application
master (AM). However, in my
Hi Gary,
Yes, my problem mentioned in the original post had been resolved by
correcting the zookeeper connection string.
I have two other relevant questions, if you have time, please help:
1. Regarding JM high availability, when I shut down the host where the JM was
running, YARN would detect that miss
Hi Averell,
> Then I have another question: when the JM cannot start/connect to the JM on
> .88, why didn't it try on .82 where resources are still available?
When you are deploying on YARN, the TM container placement is decided by the
YARN scheduler and not by Flink. Without seeing the complete logs,
Hi Gary,
Thanks for your support.
I use flink 1.7.0. I will try to test without that -n.
Below are the JM log (on server .82) and the TM log (on server .88). I'm
sorry that I missed that TM log before asking; I thought it would not be
relevant. I just fixed the issue with connection to zoo
Hi Averell,
What Flink version are you using? Can you attach the full logs from JM and
TMs? Since Flink 1.5, the -n parameter (number of taskmanagers) should be
omitted unless you are in legacy mode [1].
> As per that screenshot, it looks like there are 2 task managers still
> running (one on eac
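For illustration, a submission without `-n` might look like this (flags per
Flink 1.5+ non-legacy mode; parallelism, slot count, and jar name are
placeholders):

```shell
# Since Flink 1.5 (non-legacy mode), TM containers are requested on demand
# from the configured parallelism, so -n (number of taskmanagers) is omitted
flink run -m yarn-cluster -p 4 -ys 2 my-job.jar
```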