Re: Per job cluster doesn't shut down after the job is canceled

2018-11-20 Thread Gary Yao
Hi Paul, Sorry for the late reply. I had a look at the attached log. I think FLINK-10482 affects the shutdown of the "per-job cluster" after all. Here is the relevant stack trace: 2018-11-06 10:45:17,405 ERROR org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor - Caught exception whil

Re: Per job cluster doesn't shut down after the job is canceled

2018-11-14 Thread Paul Lam
Hi Ufuk, Thanks for your reply! I’m afraid that my case is different. Since the Flink on YARN application has not exited, we do not have an application exit code yet (but the job status is determined). Best, Paul Lam > On Nov 14, 2018, at 16:49, Ufuk Celebi wrote: > > Hey Paul, > > It might be relat

Re: Per job cluster doesn't shut down after the job is canceled

2018-11-14 Thread Ufuk Celebi
Hey Paul, It might be related to this: https://github.com/apache/flink/pull/7004 (see linked issue for details). Best, Ufuk > On Nov 14, 2018, at 09:46, Paul Lam wrote: > > Hi Gary, > > Thanks for your reply and sorry for the delay. The attachment is the > jobmanager logs after invoking th

Re: Per job cluster doesn't shut down after the job is canceled

2018-11-09 Thread Gary Yao
Hi Paul, Can you share the complete logs, or at least the logs after invoking the cancel command? If you want to debug it yourself, check if MiniDispatcher#jobReachedGloballyTerminalState [1] is invoked, and see how the jobTerminationFuture is used. Best, Gary [1] https://github.com/apache/flin

Per job cluster doesn't shut down after the job is canceled

2018-11-06 Thread Paul Lam
Hi, I’m using Flink 1.5.3, and I’ve seen several times that the detached YARN cluster doesn’t shut down after the job is canceled successfully. The only errors I found in the jobmanager’s log are shown below (the second one appears multiple times): ``` 2018-11-07 09:48:38,663 WARN org.apache.flink