Hi Yang, I'm able to see the taskmanager and jobmanager logs after I changed the log4j.properties file (/usr/lib/flink/conf). Thank you! I updated the file as shown below. I had to kill the app ( yarn application -kill <appid> ) and start the Flink job again to get the logs. This doesn't seem like an efficient way to do it. I was wondering if there's a simpler way to apply logging changes in production. Let me know, please!
*Actual*

log4j.rootLogger=INFO,file

# Log all infos in the given file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# suppress the irrelevant (wrong) warnings from the netty channel handler
log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file

*Modified:* commented out the above and added the new logging configuration from the actual Flink application log4j.properties file

#log4j.rootLogger=INFO,file

# Log all infos in the given file
#log4j.appender.file=org.apache.log4j.FileAppender
#log4j.appender.file.file=${log.file}
#log4j.appender.file.append=false
#log4j.appender.file.layout=org.apache.log4j.PatternLayout
#log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# suppress the irrelevant (wrong) warnings from the netty channel handler
#log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file

# This affects logging for both user code and Flink
rootLogger.level = INFO
rootLogger.appenderRef.file.ref = MainAppender

# Uncomment this if you want to _only_ change Flink's logging
#logger.flink.name = org.apache.flink
#logger.flink.level = INFO

# The following lines keep the log level of common libraries/connectors on
# log level INFO. The root logger does not override this. You have to manually
# change the log levels here.
logger.akka.name = akka
logger.akka.level = INFO
logger.kafka.name = org.apache.kafka
logger.kafka.level = INFO
logger.hadoop.name = org.apache.hadoop
logger.hadoop.level = INFO
logger.zookeeper.name = org.apache.zookeeper
logger.zookeeper.level = INFO

# Log all infos in the given file
appender.main.name = MainAppender
appender.main.type = File
appender.main.append = false
appender.main.fileName = ${sys:log.file}
appender.main.layout.type = PatternLayout
appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Suppress the irrelevant (wrong) warnings from the Netty channel handler
logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
logger.netty.level = OFF

On Tue, Nov 3, 2020 at 4:56 AM Yang Wang <danrtsey...@gmail.com> wrote:

> You could issue "ps -ef | grep container_id_for_some_tm". And then you
> will find the following java options about log4j.
>
> -Dlog.file=/var/log/hadoop-yarn/containers/application_xx/container_xx/taskmanager.log
> -Dlog4j.configuration=file:./log4j.properties
> -Dlog4j.configurationFile=file:./log4j.properties
>
> Best,
> Yang
>
> Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 11:37 PM:
>
>> Sure. I will check that and get back to you. Could you please share how
>> to check the java dynamic options?
>>
>> Best,
>> Diwakar
>>
>> On Mon, Nov 2, 2020 at 1:33 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>
>>> If you have already updated the log4j.properties and it still does not
>>> work, then I suggest logging in to the Yarn NodeManager machine and
>>> checking that the log4j.properties in the container workdir is correct.
>>> Also, you could have a look at whether the java dynamic options are
>>> correctly set.
>>>
>>> I think it should work if the log4j.properties and java dynamic options
>>> are set correctly.
>>>
>>> BTW, could you share the new yarn logs?
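On the restart question: one possibly simpler option for production (an assumption on my side, not yet verified on EMR) is log4j2's automatic reconfiguration. Adding a monitorInterval to log4j.properties makes log4j2 re-read the file periodically, so log level changes could take effect without killing the YARN application:

```properties
# Hypothetical addition to log4j.properties (log4j2 properties format):
# re-check the config file every 30 seconds and reconfigure if it changed
monitorInterval = 30
```

Note that on YARN each container gets its own copy of log4j.properties in its working directory, so the edit would have to reach the file the running container actually reads (the one pointed to by -Dlog4j.configurationFile in the process arguments).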
>>>
>>> Best,
>>> Yang
>>>
>>> Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 4:32 PM:
>>>
>>>> Hi Yang,
>>>>
>>>> Thank you so much for taking a look at the log files. I changed my
>>>> log4j.properties. Below is the actual file that I got from the EMR 6.1.0
>>>> distribution of Flink 1.11. I observed that it is different from the
>>>> Flink 1.11 that I downloaded, so I changed it. Still I didn't see any logs.
>>>>
>>>> *Actual*
>>>> log4j.rootLogger=INFO,file
>>>>
>>>> # Log all infos in the given file
>>>> log4j.appender.file=org.apache.log4j.FileAppender
>>>> log4j.appender.file.file=${log.file}
>>>> log4j.appender.file.append=false
>>>> log4j.appender.file.layout=org.apache.log4j.PatternLayout
>>>> log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>
>>>> # suppress the irrelevant (wrong) warnings from the netty channel handler
>>>> log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file
>>>>
>>>> *Modified:* commented out the above and added the new logging
>>>> configuration from the actual Flink application log4j.properties file
>>>>
>>>> #log4j.rootLogger=INFO,file
>>>>
>>>> # Log all infos in the given file
>>>> #log4j.appender.file=org.apache.log4j.FileAppender
>>>> #log4j.appender.file.file=${log.file}
>>>> #log4j.appender.file.append=false
>>>> #log4j.appender.file.layout=org.apache.log4j.PatternLayout
>>>> #log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>
>>>> # suppress the irrelevant (wrong) warnings from the netty channel handler
>>>> #log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR,file
>>>>
>>>> # This affects logging for both user code and Flink
>>>> rootLogger.level = INFO
>>>> rootLogger.appenderRef.file.ref = MainAppender
>>>>
>>>> # Uncomment this if you want to _only_ change Flink's logging
>>>> #logger.flink.name = org.apache.flink
>>>> #logger.flink.level = INFO
>>>>
>>>> # The following lines keep the log
level of common libraries/connectors on
>>>> # log level INFO. The root logger does not override this. You have to
>>>> # manually change the log levels here.
>>>> logger.akka.name = akka
>>>> logger.akka.level = INFO
>>>> logger.kafka.name = org.apache.kafka
>>>> logger.kafka.level = INFO
>>>> logger.hadoop.name = org.apache.hadoop
>>>> logger.hadoop.level = INFO
>>>> logger.zookeeper.name = org.apache.zookeeper
>>>> logger.zookeeper.level = INFO
>>>>
>>>> # Log all infos in the given file
>>>> appender.main.name = MainAppender
>>>> appender.main.type = File
>>>> appender.main.append = false
>>>> appender.main.fileName = ${sys:log.file}
>>>> appender.main.layout.type = PatternLayout
>>>> appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>
>>>> # Suppress the irrelevant (wrong) warnings from the Netty channel handler
>>>> logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
>>>> logger.netty.level = OFF
>>>>
>>>> **********************************
>>>> I also think it's related to the log4j setting, but I'm not able to
>>>> figure it out. Please let me know if you want any other log files or
>>>> configuration.
>>>>
>>>> Thanks.
>>>>
>>>> On Sun, Nov 1, 2020 at 10:06 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>
>>>>> Hi Diwakar Jha,
>>>>>
>>>>> From the logs you have provided, everything seems to be working as
>>>>> expected. The JobManager and TaskManager java processes have been
>>>>> started with the correct dynamic options, especially for the logging.
>>>>>
>>>>> Could you share the content of $FLINK_HOME/conf/log4j.properties? I
>>>>> think there's something wrong with the log4j config file. For example,
>>>>> it may be in the log4j1 format, while we are using log4j2 in Flink 1.11.
>>>>>
>>>>> Best,
>>>>> Yang
>>>>>
>>>>> Diwakar Jha <diwakar.n...@gmail.com> wrote on Mon, Nov 2, 2020 at 1:57 AM:
>>>>>
>>>>>> Hi,
>>>>>> I'm running Flink 1.11 on EMR 6.1.0.
I can see my job is running fine,
>>>>>> but I'm not seeing any taskmanager/jobmanager logs.
>>>>>> I see the below error in stdout:
>>>>>>
>>>>>> 18:29:19.834 [flink-akka.actor.default-dispatcher-28] ERROR
>>>>>> org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
>>>>>> - Failed to transfer file from TaskExecutor
>>>>>> container_1604033334508_0001_01_000004.
>>>>>> java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException:
>>>>>> The file LOG does not exist on the TaskExecutor.
>>>>>>
>>>>>> I've been stuck at this step for a couple of days now and am not able
>>>>>> to migrate to Flink 1.11. I would appreciate it if anyone could help me.
>>>>>> I have the following setup:
>>>>>> a) I'm deploying Flink using yarn. I have attached the yarn
>>>>>> application id logs.
>>>>>> c) stsd setup
>>>>>>
>>>>>> metrics.reporters: stsd
>>>>>> metrics.reporter.stsd.factory.class: org.apache.flink.metrics.statsd.StatsDReporterFactory
>>>>>> metrics.reporter.stsd.host: localhost
>>>>>> metrics.reporter.stsd.port: 8125
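As a footnote to Yang's `ps -ef` suggestion earlier in the thread, here is a small sketch of pulling the -Dlog.file path out of a TaskManager container's command line. The command line below is a made-up sample (the real one comes from `ps -ef | grep container_<id>` on the NodeManager host); the paths mirror the placeholder ones Yang quoted.

```shell
# Sample TaskManager command line standing in for real `ps -ef` output
# (hypothetical values; on a real cluster, capture this from the NodeManager).
cmdline='java -Xmx1448m -Dlog.file=/var/log/hadoop-yarn/containers/application_xx/container_xx/taskmanager.log -Dlog4j.configurationFile=file:./log4j.properties org.apache.flink.yarn.YarnTaskExecutorRunner'

# Extract the value of the -Dlog.file option: match the option, then take
# everything after the first '='.
logfile=$(printf '%s\n' "$cmdline" | grep -o '\-Dlog\.file=[^ ]*' | cut -d= -f2)
echo "$logfile"
```

Once the path is known, the file can be tailed directly on the host. After the application finishes, `yarn logs -applicationId <appid>` is another standard way to fetch the aggregated container logs.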