Hi Ahmad

We mainly recommend that users set the checkpoint interval to three minutes.
If you don't rely on keyed state for persistence, you could also disable
checkpointing and let the Kafka client commit offsets automatically [1], which
might be the most lightweight solution.


[1] 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-offset-committing-behaviour-configuration
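
As a rough illustration of the auto-commit option described in [1], the sketch
below builds the consumer properties using only java.util.Properties.
`enable.auto.commit` and `auto.commit.interval.ms` are standard Kafka consumer
keys; the class name, helper name, broker address, group id and the 5-second
commit interval are placeholder assumptions, not values from this thread.

```java
import java.util.Properties;

public class KafkaAutoCommitProps {

    // Builds consumer properties that let the Kafka client commit offsets
    // periodically on its own. The 5000 ms interval is an arbitrary example value.
    static Properties autoCommitProperties(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", "5000");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address and group id.
        Properties props = autoCommitProperties("localhost:9092", "my-flink-job");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These properties would be passed to the Flink Kafka consumer when it is
constructed. With checkpointing disabled, Flink does not restore operator state
after a failure, so this only fits jobs that can tolerate that.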

Best
Yun Tang
________________________________
From: Ahmad Hassan <[email protected]>
Sent: Tuesday, January 28, 2020 17:43
To: user <[email protected]>
Subject: Re: Flink RocksDB logs filling up disk space

Hi Yun,

Thank you for pointing that out. In our production landscapes with live
customers, we have a 10-second checkpoint interval and an average checkpoint
size of 7 MB. We use incremental checkpoints. If we make the checkpoint
interval longer (e.g. 1 minute), the Kafka consumer lag starts increasing. The
reason is that over a period of 1 minute the checkpoint size grows, the job
takes a long time to complete the checkpoint, and as a result the Kafka
consumer lag for our live traffic goes up. To keep the checkpoint size small,
we tried the 10-second option, which is working out well: our Kafka lag never
exceeds 20 messages on average. But I agree with you that the 10-second option
does not feel right and is too frequent in my opinion.
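
To put the intervals mentioned in this thread side by side, here is a small
arithmetic sketch. It assumes one checkpoint is triggered per interval and that
each one completes before the next is due. The class and method names are
illustrative only. The 7 MB figure is the average reported above for the
10-second interval, so the volume estimate is computed for that interval only.

```java
public class CheckpointCadence {

    // Number of checkpoints triggered per hour for a given interval in seconds,
    // assuming exactly one checkpoint per interval.
    static long checkpointsPerHour(long intervalSeconds) {
        return 3600 / intervalSeconds;
    }

    public static void main(String[] args) {
        // 10 s (current setting), 60 s (tried), 180 s (three minutes).
        long[] intervalsSeconds = {10, 60, 180};
        for (long seconds : intervalsSeconds) {
            System.out.println(seconds + " s interval: "
                    + checkpointsPerHour(seconds) + " checkpoints/hour");
        }

        // Approximate incremental checkpoint volume per hour at the 10 s
        // interval, using the 7 MB average reported in this message.
        long averageCheckpointMb = 7;
        System.out.println("10 s interval: about "
                + checkpointsPerHour(10) * averageCheckpointMb + " MB/hour");
    }
}
```

Under these assumptions, a 10-second interval means 360 checkpoints per hour
(roughly 2520 MB written), compared with 60 per hour at one minute and 20 per
hour at three minutes.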

Do you have any recommendations for the checkpoint interval, please?

Best Regards,


On Tue, 28 Jan 2020 at 07:46, Yun Tang <[email protected]> wrote:
Hi Ahmad

Apart from setting the logger level of RocksDB, I also wonder why the RocksDB
checkpoint IO logs are filling up disk space so quickly. How large is the local
checkpoint state, and how long is the checkpoint interval? I suspect the
checkpoint interval is too short. Even if you could avoid recording so many
logs, I don't think the current checkpoint configuration is appropriate.

Best
Yun Tang
________________________________
From: Ahmad Hassan <[email protected]>
Sent: Monday, January 27, 2020 20:22
To: user <[email protected]>
Subject: Re: Flink RocksDB logs filling up disk space


Thanks Chesnay!

On Mon, 27 Jan 2020 at 11:29, Chesnay Schepler <[email protected]> wrote:
Please see https://issues.apache.org/jira/browse/FLINK-15068

On 27/01/2020 12:22, Ahmad Hassan wrote:
Hello,

In our production systems, we see that Flink RocksDB checkpoint IO logs are
filling up disk space very quickly, on the order of GBs, because the logging
is very verbose. How do we disable or suppress these logs, please? The RocksDB
file checkpoint.cc is dumping a huge amount of checkpoint logs like

Log(db_options.info_log, "Hard Linking %s", src_fname.c_str());


Best Regards,

