If you are using your own deployer (i.e., a Java program that calls the Flink client API to submit Flink jobs), you need to check in the JobManager configuration in the web UI whether "$internal.yarn.log-config-file" is correctly set. If not, you may need to set "$internal.deployment.config-dir" in your deployer rather than simply setting the FLINK_CONF_DIR environment variable, because your deployer needs to do some of the configuration setup that CliFrontend does. Please have a try and share more feedback.
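To make the idea concrete, below is a rough, stdlib-only Java sketch of the kind of configuration loading a deployer has to replicate. This is an illustration of the idea, not Flink's actual API: a real deployer would typically use Flink's own GlobalConfiguration/CliFrontend code paths instead of hand-parsing flink-conf.yaml, and the only detail taken from the discussion above is the internal key name "$internal.deployment.config-dir".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch: mimics, in plain Java, what CliFrontend effectively does --
// load flink-conf.yaml from a config dir AND record that dir under the
// internal key, so later deployment code can find log4j.properties next to
// it. Setting only the FLINK_CONF_DIR environment variable in the deployer's
// process does not achieve the second step.
public class DeployerConfigSketch {
    static final String INTERNAL_CONF_DIR_KEY = "$internal.deployment.config-dir";

    static Map<String, String> loadFlinkConf(Path confDir) {
        Map<String, String> conf = new LinkedHashMap<>();
        try {
            for (String line : Files.readAllLines(confDir.resolve("flink-conf.yaml"))) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) continue; // skip comments
                int colon = line.indexOf(':');
                if (colon < 0) continue;                              // skip malformed lines
                conf.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The step a naive deployer forgets: remember where the config lives.
        conf.put(INTERNAL_CONF_DIR_KEY, confDir.toString());
        return conf;
    }

    // Builds a throwaway conf dir with one entry and loads it, for demonstration.
    static Map<String, String> demo() {
        try {
            Path dir = Files.createTempDirectory("flink-conf");
            Files.write(dir.resolve("flink-conf.yaml"),
                    Arrays.asList("# a comment", "jobmanager.rpc.port: 40093"));
            return loadFlinkConf(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = demo();
        System.out.println(conf.get("jobmanager.rpc.port"));          // prints 40093
        System.out.println(conf.get(INTERNAL_CONF_DIR_KEY) != null);  // prints true
    }
}
```

In a real deployer you would pass the resulting configuration (including the internal config-dir key) to the cluster descriptor, which is how the log4j.properties file gets shipped to YARN.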
Best,
Yang

马阳阳 <ma_yang_y...@163.com> wrote on Mon, Nov 16, 2020 at 2:47 PM:
> Hi Yang,
> We run a self-compiled Flink-1.12-SNAPSHOT and could not see any
> taskmanager/jobmanager logs.
> I have checked the log4j.properties file, and it's in the right format.
> And FLINK_CONF_DIR is set.
> When checking the Java dynamic options of the task manager, I found that
> the log-related options are not set.
> This is the output when issuing "ps -ef | grep <container_id>".
>
> yarn 31049 30974 9 13:57 ? 00:03:31
> /usr/lib/jvm/jdk1.8.0_121/bin/java -Xmx536870902 -Xms536870902
> -XX:MaxDirectMemorySize=268435458 -XX:MaxMetaspaceSize=268435456
> org.apache.flink.yarn.YarnTaskExecutorRunner -D
> taskmanager.memory.framework.off-heap.size=134217728b -D
> taskmanager.memory.network.max=134217730b -D
> taskmanager.memory.network.min=134217730b -D
> taskmanager.memory.framework.heap.size=134217728b -D
> taskmanager.memory.managed.size=536870920b -D taskmanager.cpu.cores=1.0 -D
> taskmanager.memory.task.heap.size=402653174b -D
> taskmanager.memory.task.off-heap.size=0b --configDir .
> -Djobmanager.rpc.address=dhpdn09-113
> -Dtaskmanager.resource-id=container_1604585185669_635512_01_000713
> -Dweb.port=0
> -Dweb.tmpdir=/tmp/flink-web-1d373ec2-0cbe-49b8-9592-3ac1d207ad63
> -Djobmanager.rpc.port=40093 -Drest.address=dhpdn09-113
>
> My question is, what may be the problem here? Any suggestions?
>
> By the way, we submit the program from a Java program instead of from the
> command line.
>
> Thanks.
>
> ps: I sent the mail to the Spark user mailing list unintentionally, so I
> resent it to the Flink user mailing list. Sorry for the inconvenience to
> @Yang Wang
>
> At 2020-11-03 20:56:19, "Yang Wang" <danrtsey...@gmail.com> wrote:
> > You could issue "ps -ef | grep container_id_for_some_tm". And then you
> > will find the following Java options about log4j.
> > -Dlog.file=/var/log/hadoop-yarn/containers/application_xx/container_xx/taskmanager.log
> > -Dlog4j.configuration=file:./log4j.properties
> > -Dlog4j.configurationFile=file:./log4j.properties
>
> Best,
> Yang
>
> Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 11:37 PM:
>
>> Sure. I will check that and get back to you. Could you please share how
>> to check the Java dynamic options?
>>
>> Best,
>> Diwakar
>>
>> On Mon, Nov 2, 2020 at 1:33 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>
>>> If you have already updated the log4j.properties and it still does not
>>> work, then I suggest logging in to the Yarn NodeManager machine and
>>> checking that the log4j.properties in the container workdir is correct.
>>> Also, you could check whether the Java dynamic options are correctly set.
>>>
>>> I think it should work if the log4j.properties and Java dynamic options
>>> are set correctly.
>>>
>>> BTW, could you share the new yarn logs?
>>>
>>> Best,
>>> Yang
>>>
>>> Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 4:32 PM:
>>>
>>>> Hi Yang,
>>>>
>>>> Thank you so much for taking a look at the log files. I changed my
>>>> log4j.properties. Below is the actual file that I got from the EMR 6.1.0
>>>> distribution of Flink 1.11. I observed that it is different from the
>>>> Flink 1.11 distribution that I downloaded, so I changed it. Still, I
>>>> didn't see any logs.
>>>>
>>>> *Actual*
>>>> log4j.rootLogger=INFO,file
>>>>
>>>> # Log all infos in the given file
>>>> log4j.appender.file=org.apache.log4j.FileAppender
>>>> log4j.appender.file.file=${log.file}
>>>> log4j.appender.file.append=false
>>>> log4j.appender.file.layout=org.apache.log4j.PatternLayout
>>>> log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>
>>>> # suppress the irrelevant (wrong) warnings from the netty channel handler
>>>> log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file
>>>>
>>>> *Modified:* commented out the above and added the new logging
>>>> configuration from the actual Flink application log4j.properties file
>>>>
>>>> #log4j.rootLogger=INFO,file
>>>>
>>>> # Log all infos in the given file
>>>> #log4j.appender.file=org.apache.log4j.FileAppender
>>>> #log4j.appender.file.file=${log.file}
>>>> #log4j.appender.file.append=false
>>>> #log4j.appender.file.layout=org.apache.log4j.PatternLayout
>>>> #log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>
>>>> # suppress the irrelevant (wrong) warnings from the netty channel handler
>>>> #log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file
>>>>
>>>> # This affects logging for both user code and Flink
>>>> rootLogger.level = INFO
>>>> rootLogger.appenderRef.file.ref = MainAppender
>>>>
>>>> # Uncomment this if you want to _only_ change Flink's logging
>>>> #logger.flink.name = org.apache.flink
>>>> #logger.flink.level = INFO
>>>>
>>>> # The following lines keep the log level of common libraries/connectors on
>>>> # log level INFO. The root logger does not override this. You have to manually
>>>> # change the log levels here.
>>>> logger.akka.name = akka
>>>> logger.akka.level = INFO
>>>> logger.kafka.name = org.apache.kafka
>>>> logger.kafka.level = INFO
>>>> logger.hadoop.name = org.apache.hadoop
>>>> logger.hadoop.level = INFO
>>>> logger.zookeeper.name = org.apache.zookeeper
>>>> logger.zookeeper.level = INFO
>>>>
>>>> # Log all infos in the given file
>>>> appender.main.name = MainAppender
>>>> appender.main.type = File
>>>> appender.main.append = false
>>>> appender.main.fileName = ${sys:log.file}
>>>> appender.main.layout.type = PatternLayout
>>>> appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>
>>>> # Suppress the irrelevant (wrong) warnings from the Netty channel handler
>>>> logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
>>>> logger.netty.level = OFF
>>>>
>>>> **********************************
>>>> I also think it's related to the log4j setting, but I'm not able to
>>>> figure it out.
>>>> Please let me know if you want any other log files or configuration.
>>>>
>>>> Thanks.
>>>>
>>>> On Sun, Nov 1, 2020 at 10:06 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>
>>>>> Hi Diwakar Jha,
>>>>>
>>>>> From the logs you have provided, everything seems to be working as
>>>>> expected. The JobManager and TaskManager Java processes have been
>>>>> started with correct dynamic options, especially for logging.
>>>>>
>>>>> Could you share the content of $FLINK_HOME/conf/log4j.properties? I
>>>>> think there's something wrong with the log4j config file. For example,
>>>>> it is in log4j1 format, but we are using log4j2 in Flink 1.11.
>>>>>
>>>>> Best,
>>>>> Yang
>>>>>
>>>>> Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 1:57 AM:
>>>>>
>>>>>> Hi,
>>>>>> I'm running Flink 1.11 on EMR 6.1.0. I can see my job is running fine,
>>>>>> but I'm not seeing any taskmanager/jobmanager logs.
>>>>>> I see the below error in stdout.
>>>>>> 18:29:19.834 [flink-akka.actor.default-dispatcher-28] ERROR
>>>>>> org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
>>>>>> - Failed to transfer file from TaskExecutor
>>>>>> container_1604033334508_0001_01_000004.
>>>>>> java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException:
>>>>>> The file LOG does not exist on the TaskExecutor.
>>>>>>
>>>>>> I'm stuck at this step for a couple of days now and not able to
>>>>>> migrate to Flink 1.11. I would appreciate it if anyone could help me.
>>>>>> I have the following setup:
>>>>>> a) I'm deploying Flink using YARN. I have attached the YARN
>>>>>> application logs.
>>>>>> b) stsd (StatsD) setup
>>>>>>
>>>>>> metrics.reporters: stsd
>>>>>> metrics.reporter.stsd.factory.class: org.apache.flink.metrics.statsd.StatsDReporterFactory
>>>>>> metrics.reporter.stsd.host: localhost
>>>>>> metrics.reporter.stsd.port: 8125
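As an aside on the diagnostic suggested earlier in the thread (inspecting the TaskManager's command line with "ps -ef"), the check can be sketched as a small stdlib-only Java helper. The option names (-Dlog.file, -Dlog4j.configuration, -Dlog4j.configurationFile) are the ones quoted in the thread; everything else here is illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hedged sketch: given a TaskManager process command line (as printed by
// "ps -ef | grep <container_id>"), report which logging-related dynamic
// options are missing. Note: a plain contains() check is approximate --
// "-Dlog4j.configuration=" is a prefix of "-Dlog4j.configurationFile=",
// so a stricter parser would tokenize the command line instead.
public class TmLogOptionsCheck {
    static final List<String> REQUIRED = Arrays.asList(
            "-Dlog.file=", "-Dlog4j.configuration=", "-Dlog4j.configurationFile=");

    static List<String> missingLogOptions(String cmdline) {
        return REQUIRED.stream()
                .filter(opt -> !cmdline.contains(opt))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Abbreviated command line from the thread: no log options at all.
        String broken = "/usr/lib/jvm/jdk1.8.0_121/bin/java -Xmx536870902 "
                + "org.apache.flink.yarn.YarnTaskExecutorRunner --configDir . "
                + "-Djobmanager.rpc.address=dhpdn09-113 -Dweb.port=0";
        System.out.println("missing: " + missingLogOptions(broken));

        String healthy = broken
                + " -Dlog.file=/var/log/containers/app/taskmanager.log"
                + " -Dlog4j.configuration=file:./log4j.properties"
                + " -Dlog4j.configurationFile=file:./log4j.properties";
        System.out.println("missing: " + missingLogOptions(healthy));  // prints missing: []
    }
}
```

If the first list is non-empty, the container was launched without logging options, which matches the "The file LOG does not exist on the TaskExecutor" symptom above.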