Hi all,

We are developing several batch processing applications using the DataSet API 
of Apache Flink.
Currently, we are facing an issue with one of our production 
environments because its disk usage has increased enormously. After a quick 
investigation, we concluded that the /tmp/flink-io-{} directory (under the 
parent directory of the Apache Flink deployment) contains files of more than 
1TB, and we need to delete them regularly to keep our system working 
properly. At first sight, deleting these temp files has no significant 
impact. So, I need your help to answer the following 
questions:

  *   What kind of data is stored in the aforementioned directory?
  *   Why do the respective files have such an enormous size?
  *   How can we limit the size of the data written to the respective directory?
  *   Is there any way to delete such files automatically when they are no longer needed?

Thanks in advance for your help,
Konstantinos
