Hi!

Concerning (1): We have seen that a few times. The JVMs / threads sometimes do not exit gracefully, and YARN is not always able to kill the process (a YARN bug). I am currently refactoring the YARN resource manager (to allow easy addition of other frameworks) and have addressed this as part of that work. It will be in the master in a bit.
Concerning (2): Do you know which component in Flink uses the HTTP client?

Greetings,
Stephan

On Tue, Jan 5, 2016 at 2:49 PM, Maximilian Bode <maximilian.b...@tngtech.com> wrote:

> Hi everyone,
>
> Regarding Q1, I believe I have witnessed a comparable phenomenon in a
> (3-node, non-EMR) YARN cluster. After shutting down the yarn session via
> `stop`, one container seems to linger. `yarn application -list` is
> empty, whereas `bin/yarn-session.sh -q` lists the left-over container.
> Also, there is still one application shown as 'running' in Ambari’s YARN
> pane under current applications. Then, after some time (on the order of a
> few minutes), it disappears and the resources are available again.
>
> I have not tested this behavior extensively so far. Notably, I was not
> able to reproduce it by just starting a session and then ending it again
> right away without looking at the JobManager web interface. Maybe this
> produces some kind of lag as far as YARN containers are concerned?
>
> Cheers,
> Max
>
> On 04.01.2016 at 12:52, Chiwan Park <chiwanp...@apache.org> wrote:
> >
> > Hi All,
> >
> > I have some problems using Flink on an Amazon EMR cluster.
> >
> > Q1. Sometimes, the jobmanager container still exists after destroying
> > the yarn session by pressing Ctrl+C. In that case, the Flink YARN app
> > seems to have exited correctly in the YARN RM dashboard, but there is
> > still a running container in the dashboard. From the logs of the
> > container, I can tell that the container is the jobmanager.
> >
> > I cannot kill the container because there is no permission to restart
> > YARN RM in Amazon EMR. In my small Hadoop cluster (w/3 nodes), the
> > problem doesn’t appear.
> >
> > Q2. I tried to use the S3 file system in Flink on EMR, but I can’t use
> > it because of a version conflict of Apache Httpclient. By default, the
> > implementation of the S3 file system in EMR is
> > `com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem`, which is linked
> > with another version of Apache Httpclient.
> >
> > As I wrote above, I cannot restart the Hadoop cluster after modifying
> > conf-site.xml because of lack of permission. How can I solve this
> > problem?
> >
> > Regards,
> > Chiwan Park
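
For the Httpclient conflict in Q2, one generic way to see which jar a class is actually loaded from is to ask the JVM for the class's code source. The sketch below is only an illustration and not part of Flink or EMR: `JarLocator` and `locate` are hypothetical names, and `org.apache.http.client.HttpClient` is assumed to be the conflicting class. It is only meaningful when run with the same classpath as the failing job.

```java
import java.security.CodeSource;

public class JarLocator {
    /** Returns where the named class was loaded from, or a short marker if that is unknown. */
    static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK itself typically have no code source.
            if (src == null || src.getLocation() == null) {
                return "<bootstrap or unknown>";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing from the classpath or could not be linked.
            return "<not loadable: " + e + ">";
        }
    }

    public static void main(String[] args) {
        // Assumed default: the Apache HttpClient interface; pass another class name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.http.client.HttpClient";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the printed location points at an EMR-provided jar rather than the one bundled with the job, that would be consistent with the version conflict described above.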