Hi Yu,

If you have the client job log, you can find your application id in the message below:

The Flink YARN client has been started in detached mode. In order to stop Flink 
on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill {appId}
Please also note that the temporary files of the YARN session in the home 
directory will not be removed.
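As a minimal sketch, the application id can be pulled out of the client log by grepping for the suggested kill command. The log excerpt and application id below are hypothetical; only the "yarn application -kill" line quoted above is what the real client prints:

```shell
# Hypothetical excerpt of the YARN client log; the kill-command line is the
# one the detached-mode client actually prints, the id itself is made up.
client_log='The Flink YARN client has been started in detached mode. In order to stop Flink
on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1567000000000_0042'

# Extract the application id from the suggested kill command.
app_id=$(printf '%s\n' "$client_log" | grep -o 'application_[0-9_]*')
echo "$app_id"
```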

Best
Yun Tang

________________________________
From: Zhu Zhu <reed...@gmail.com>
Sent: Friday, August 30, 2019 16:24
To: Yu Yang <yuyan...@gmail.com>
Cc: user <user@flink.apache.org>
Subject: Re: best practices on getting flink job logs from Hadoop history 
server?

Hi Yu,

Regarding #2,
Currently we search for the task deployment entries in the JM log, which
contain the container and the machine each task is deployed to.
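A rough sketch of that search: grep the JM log for deployment entries and pull out the container id. The log line below is illustrative only; the exact wording varies across Flink versions, but deployment entries name both the container and the host:

```shell
# Hypothetical jobmanager.log line (format varies by Flink version).
jm_log='INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Map (3/8) (attempt #0) to container_e05_1567000000000_0042_01_000003 @ worker-17.example.com'

# Find which container (and thus which machine) a subtask was deployed to.
deploy_container=$(printf '%s\n' "$jm_log" | grep 'Deploying' | grep -o 'container_[A-Za-z0-9_]*')
echo "$deploy_container"
```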

Regarding #3,
You can find the application logs aggregated per machine on DFS; the path
depends on your YARN config.
Each per-machine log may still include multiple TM logs, but it can be much
smaller than the single log generated by "yarn logs ...".
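For illustration, the per-node aggregated-log path can be built from the YARN config. The sketch below assumes the default values of yarn.nodemanager.remote-app-log-dir (/tmp/logs) and its suffix (logs); check yarn-site.xml on your cluster, and note the application id is hypothetical:

```shell
# Assumed defaults; override with the values from your yarn-site.xml.
remote_log_dir=/tmp/logs
suffix=logs
user=$USER
app_id=application_1567000000000_0042   # hypothetical id

# Under this directory, each file holds the logs of one node (possibly
# several TM containers), so you can fetch only the machine you need.
log_path="$remote_log_dir/$user/$suffix/$app_id"
echo "hdfs dfs -ls $log_path"
```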

Thanks,
Zhu Zhu

Yu Yang <yuyan...@gmail.com> wrote on Friday, August 30, 2019 at 3:58 PM:
Hi,

We run flink jobs through yarn on hadoop clusters. One challenge that we are 
facing is to simplify flink job log access.

The flink job logs can be accessed using "yarn logs $application_id". That 
approach has a few limitations:

  1.  It is not straightforward to find the yarn application id based on the 
flink job id.
  2.  It is difficult to find the corresponding container id for the flink 
subtasks.
  3.  For jobs that have many tasks, it is inefficient to use "yarn logs ..."  
as it mixes logs from all task managers.

Any suggestions on best practices for getting the logs of completed flink jobs 
that ran on yarn?

Regards,
-Yu

