Hi Yang,

You are right. Since then, I looked for open files and found *.out/*.err
files on that partition, and as you mentioned, they don't roll.
I could implement a workaround that restarts the streaming job every week
or so, but I would really rather not go that way.

I tried forwarding logs to files, which I could then roll, but then I no
longer see the logs in the GUI.

So my question is: how do I make them roll?
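
For reference, this is the kind of rolling setup I tried when forwarding logs to a file (a sketch based on the standard log4j 1.x RollingFileAppender; the size and backup-count values are just placeholders I picked, not recommendations):

```properties
# Sketch: roll the main log file with log4j 1.x (values are assumptions)
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.file=${log.file}
# Roll when the file reaches 100 MB, keeping up to 10 old files
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
```

As far as I can tell, this only rolls the log4j-managed file (taskmanager.log), not the *.out/*.err files, since those are the container's stdout/stderr redirected by YARN rather than log4j appenders.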

Regards,
Maxim.

On Tue, Aug 4, 2020 at 4:48 AM Yang Wang <danrtsey...@gmail.com> wrote:

> Hi Maxim,
>
> First, I want to confirm whether you have checked all of the
> "yarn.nodemanager.log-dirs". If you
> can access the logs in the Flink web UI, the log files (e.g. taskmanager.log,
> taskmanager.out, taskmanager.err)
> should exist. I suggest double-checking the multiple log-dirs.
>
> Since the *.out/*.err files do not roll, if you print user logs to
> stdout/stderr, the two files will grow
> over time.
>
> When you stop the Flink application, YARN cleans up all the jars and
> logs, which is why the disk space comes back.
>
>
> Best,
> Yang
>
> Maxim Parkachov <lazy.gop...@gmail.com> 于2020年7月30日周四 下午10:00写道:
>
>> Hi everyone,
>>
>> I have a strange issue with Flink logging. I use a pretty much standard
>> log4j config, which writes to standard output so that the logs are visible
>> in the Flink GUI. Deployment is on YARN in job mode. I can see logs in the UI,
>> no problem. On the servers where the Flink YARN containers run, there is a
>> disk quota on the partition where YARN normally creates logs. I see no
>> specific files in the application_xx directory, but space on the disk
>> actually decreases over time. After several weeks we eventually hit the quota.
>> It seems like some file or pipe is created but not closed, and still
>> reserves the space. After I restart the Flink job, the space is
>> immediately returned. I'm sure the Flink job is the problem; I have
>> reproduced the issue on a cluster where only one Flink job was running. Below
>> is my log4j config. Any help or ideas appreciated.
>>
>> Thanks in advance,
>> Maxim.
>> -------------------------------------------
>> # This affects logging for both user code and Flink
>> log4j.rootLogger=INFO, file, stderr
>>
>> # Uncomment this if you want to _only_ change Flink's logging
>> #log4j.logger.org.apache.flink=INFO
>>
>> # The following lines keep the log level of common libraries/connectors at
>> # log level INFO. The root logger does not override this. You have to
>> # manually change the log levels here.
>> log4j.logger.akka=INFO
>> log4j.logger.org.apache.kafka=INFO
>> log4j.logger.org.apache.hadoop=INFO
>> log4j.logger.org.apache.zookeeper=INFO
>>
>> # Log all infos in the given file
>> log4j.appender.file=org.apache.log4j.FileAppender
>> log4j.appender.file.file=${log.file}
>> log4j.appender.file.append=false
>> log4j.appender.file.layout=org.apache.log4j.PatternLayout
>> log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>
>> # Suppress the irrelevant (wrong) warnings from the Netty channel handler
>> log4j.logger.org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline=ERROR, file
>>
>>
