Hi Yang,

Thanks for your advice; now I have a good reason to upgrade to 1.11.
Regards,
Maxim.

On Tue, Aug 4, 2020 at 9:39 AM Yang Wang <danrtsey...@gmail.com> wrote:

> AFAIK, there is no way to roll the *.out/*.err files unless we hijack
> stdout/stderr in the Flink code, and that would only be a temporary hack.
>
> A better way is to write your logs to separate files that can roll via
> log4j. If you want to access them in the Flink web UI, upgrade to
> version 1.11; you will then find a "Log List" tab in the JobManager
> sidebar.
>
> Best,
> Yang
>
> Maxim Parkachov <lazy.gop...@gmail.com> wrote on Tue, Aug 4, 2020 at
> 2:52 PM:
>
>> Hi Yang,
>>
>> You are right. Since then, I looked for open files and found the
>> *.out/*.err files on that partition, and as you mentioned, they don't
>> roll. I could implement a workaround that restarts the streaming job
>> every week or so, but I really don't want to go that way.
>>
>> I tried to forward the logs to files, which I could then roll, but
>> then I don't see the logs in the GUI.
>>
>> So my question is: how do I make them roll?
>>
>> Regards,
>> Maxim.
>>
>> On Tue, Aug 4, 2020 at 4:48 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>
>>> Hi Maxim,
>>>
>>> First, I want to confirm: have you checked all of the
>>> "yarn.nodemanager.log-dirs"? If you can access the logs in the Flink
>>> web UI, the log files (e.g. taskmanager.log, taskmanager.out,
>>> taskmanager.err) should exist. I suggest double-checking all of the
>>> configured log-dirs.
>>>
>>> Since the *.out/*.err files do not roll, if you print user logs to
>>> stdout/stderr, those two files will grow over time.
>>>
>>> When you stop the Flink application, YARN cleans up all the jars and
>>> logs, which is why the disk space comes back.
>>>
>>> Best,
>>> Yang
>>>
>>> Maxim Parkachov <lazy.gop...@gmail.com> wrote on Thu, Jul 30, 2020
>>> at 10:00 PM:
>>>
>>>> Hi everyone,
>>>>
>>>> I have a strange issue with Flink logging. I use a pretty much
>>>> standard log4j config, which writes to standard output so that the
>>>> logs are visible in the Flink GUI. Deployment is on YARN in job
>>>> mode. I can see the logs in the UI, no problem. On the servers
>>>> where the Flink YARN containers run, there is a disk quota on the
>>>> partition where YARN normally creates logs. I see no suspicious
>>>> files in the application_xx directory, but free space on that disk
>>>> actually decreases over time, and after several weeks we eventually
>>>> hit the quota. It seems some file or pipe is created but never
>>>> closed, yet still reserves the space. After I restart the Flink
>>>> job, the space is immediately returned. I'm sure the Flink job is
>>>> the problem; I have reproduced the issue on a cluster where only
>>>> one Flink job was running. Below is my log4j config. Any help or
>>>> idea is appreciated.
>>>>
>>>> Thanks in advance,
>>>> Maxim.
>>>> -------------------------------------------
>>>> # This affects logging for both user code and Flink
>>>> log4j.rootLogger=INFO, file, stderr
>>>>
>>>> # Uncomment this if you want to _only_ change Flink's logging
>>>> #log4j.logger.org.apache.flink=INFO
>>>>
>>>> # The following lines keep the log level of common
>>>> # libraries/connectors on log level INFO. The root logger does not
>>>> # override this. You have to manually change the log levels here.
>>>> log4j.logger.akka=INFO
>>>> log4j.logger.org.apache.kafka=INFO
>>>> log4j.logger.org.apache.hadoop=INFO
>>>> log4j.logger.org.apache.zookeeper=INFO
>>>>
>>>> # Log all infos in the given file
>>>> log4j.appender.file=org.apache.log4j.FileAppender
>>>> log4j.appender.file.file=${log.file}
>>>> log4j.appender.file.append=false
>>>> log4j.appender.file.layout=org.apache.log4j.PatternLayout
>>>> log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>
>>>> # Suppress the irrelevant (wrong) warnings from the Netty channel
>>>> # handler
>>>> log4j.logger.org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline=ERROR, file
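
[Editor's note] For anyone landing on this thread later, here is a minimal
sketch of the setup Yang describes: keep Flink's own logging on ${log.file}
so it stays visible in the web UI, and route user-code logs to a separate
appender that rolls. It stays in the log4j 1.x properties format used in
the config above; the appender name "rolling", the 100MB/5 limits, and the
package name com.example.myjob are illustrative placeholders, not anything
Flink ships with.

-------------------------------------------
# Hypothetical addition to the config above: a size-based rolling
# appender for user logs. RollingFileAppender ships with log4j 1.x;
# MaxFileSize and MaxBackupIndex cap the disk usage of these files at
# roughly MaxFileSize * (MaxBackupIndex + 1).
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.file=${log.file}.user
log4j.appender.rolling.MaxFileSize=100MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Send the (hypothetical) user-code package only to the rolling appender;
# additivity=false keeps these messages out of the non-rolling root
# appenders, so only the rolling files grow with user log volume.
log4j.logger.com.example.myjob=INFO, rolling
log4j.additivity.com.example.myjob=false
-------------------------------------------

Since ${log.file} points into the YARN container log directory, a sibling
file like the one above should land there too, and on Flink 1.11+ it should
appear in the "Log List" tab Yang mentions.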