Hi Yun Tang & Zhu Zhu,

Thanks for the replies! With the current approach, we would still need to search the job manager log / YARN client log to find the job id / vertex id --> YARN container id mapping. I am wondering how we could propagate this kind of information into the Flink execution graph so that it gets stored in the Flink history server's archived execution graph. Any suggestions on that?
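For reference, the manual lookup we go through today looks roughly like the sketch below. This is just an illustration: the application id, container id and file names are placeholders, and the exact wording of the JobManager deployment log lines may differ between Flink versions.

  # 1. find the YARN application id of the Flink job (from the client log or the RM)
  yarn application -list | grep <flink job name>

  # 2. search the JobManager log for deployment lines showing which container a task went to
  grep -i "deploying" jobmanager.log

  # 3. pull only that container's aggregated log instead of the whole application's
  yarn logs -applicationId <application id> -containerId <container id>

It would be much nicer if this container information were archived together with the execution graph, so the history server could serve it directly.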
-Yu

On Fri, Aug 30, 2019 at 2:21 AM Yun Tang <myas...@live.com> wrote:

> Hi Yu
>
> If you have the client job log, you can find your application id from the
> description below:
>
> The Flink YARN client has been started in detached mode. In order to stop
> Flink on YARN, use the following command or a YARN web interface to stop it:
> yarn application -kill {appId}
> Please also note that the temporary files of the YARN session in the home
> directory will not be removed.
>
> Best
> Yun Tang
>
> ------------------------------
> *From:* Zhu Zhu <reed...@gmail.com>
> *Sent:* Friday, August 30, 2019 16:24
> *To:* Yu Yang <yuyan...@gmail.com>
> *Cc:* user <user@flink.apache.org>
> *Subject:* Re: best practices on getting flink job logs from Hadoop history server?
>
> Hi Yu,
>
> Regarding #2,
> Currently we search for the task deployment log in the JM log, which contains
> info on the container and machine the task is deployed to.
>
> Regarding #3,
> You can find the application logs aggregated per machine on DFS; the path
> depends on your YARN config.
> Each log may still include multiple TM logs, but it can be much smaller
> than the log generated by "yarn logs ...".
>
> Thanks,
> Zhu Zhu
>
> Yu Yang <yuyan...@gmail.com> wrote on Fri, Aug 30, 2019 at 3:58 PM:
>
> Hi,
>
> We run Flink jobs through YARN on Hadoop clusters. One challenge that we
> are facing is simplifying Flink job log access.
>
> The Flink job logs are accessible using "yarn logs $application_id".
> That approach has a few limitations:
>
> 1. It is not straightforward to find the YARN application id for a given
> Flink job id.
> 2. It is difficult to find the corresponding container id for the
> Flink sub-tasks.
> 3. For jobs that have many tasks, it is inefficient to use "yarn logs ..."
> as it mixes logs from all task managers.
>
> Any suggestions on the best practice for getting logs of completed Flink
> jobs that run on YARN?
>
> Regards,
> -Yu
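P.S. On the aggregated logs Zhu Zhu mentioned: assuming default YARN log-aggregation settings, the per-node logs usually land under the directory configured by yarn.nodemanager.remote-app-log-dir, so something like the following works, though the exact path depends on your cluster's configuration:

  hdfs dfs -ls /tmp/logs/<user>/logs/<application id>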