Hi Yun Tang & Zhu Zhu,

Thanks for the replies! With the current approach, we would still need to search the job manager log / YARN client log to find the job id / vertex id --> YARN container id mapping. I am wondering how we could propagate this kind of information into the Flink execution graph so that it gets stored in the Flink history server's archived execution graph. Any suggestions on that?
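For reference, the manual lookup we go through today looks roughly like the sketch below. This is just an illustration: the application id, container id and file names are placeholders, and the exact wording of the JobManager deployment log lines may differ between Flink versions.

  # 1. find the YARN application id of the Flink job (from the client log or the RM)
  yarn application -list | grep <flink job name>

  # 2. search the JobManager log for deployment lines showing which container a task went to
  grep -i "deploying" jobmanager.log

  # 3. pull only that container's aggregated log instead of the whole application's
  yarn logs -applicationId <application id> -containerId <container id>

It would be much nicer if this container information were archived together with the execution graph, so the history server could serve it directly.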
-Yu

On Fri, Aug 30, 2019 at 2:21 AM Yun Tang <myas...@live.com> wrote:

> Hi Yu
>
> If you have the client job log, you can find your application id from the
> description below:
>
> The Flink YARN client has been started in detached mode. In order to stop
> Flink on YARN, use the following command or a YARN web interface to stop it:
> yarn application -kill {appId}
> Please also note that the temporary files of the YARN session in the home
> directory will not be removed.
>
> Best
> Yun Tang
>
> ------------------------------
> *From:* Zhu Zhu <reed...@gmail.com>
> *Sent:* Friday, August 30, 2019 16:24
> *To:* Yu Yang <yuyan...@gmail.com>
> *Cc:* user <user@flink.apache.org>
> *Subject:* Re: best practices on getting flink job logs from Hadoop history server?
>
> Hi Yu,
>
> Regarding #2,
> Currently we search for the task deployment log in the JM log, which contains
> info on the container and machine the task is deployed to.
>
> Regarding #3,
> You can find the application logs aggregated per machine on DFS; the path
> depends on your YARN config.
> Each log may still include multiple TM logs, but it can be much smaller
> than the log generated by "yarn logs ...".
>
> Thanks,
> Zhu Zhu
>
> Yu Yang <yuyan...@gmail.com> wrote on Fri, Aug 30, 2019 at 3:58 PM:
>
> Hi,
>
> We run Flink jobs through YARN on Hadoop clusters. One challenge that we
> are facing is simplifying Flink job log access.
>
> The Flink job logs are accessible using "yarn logs $application_id".
> That approach has a few limitations:
>
> 1. It is not straightforward to find the YARN application id for a given
> Flink job id.
> 2. It is difficult to find the corresponding container id for the
> Flink sub-tasks.
> 3. For jobs that have many tasks, it is inefficient to use "yarn logs ..."
> as it mixes logs from all task managers.
>
> Any suggestions on the best practice for getting logs of completed Flink
> jobs that run on YARN?
>
> Regards,
> -Yu
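P.S. On the aggregated logs Zhu Zhu mentioned: assuming default YARN log-aggregation settings, the per-node logs usually land under the directory configured by yarn.nodemanager.remote-app-log-dir, so something like the following works, though the exact path depends on your cluster's configuration:

  hdfs dfs -ls /tmp/logs/<user>/logs/<application id>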