Till Rohrmann created FLINK-6525:
------------------------------------

             Summary: Transferred TM log/stdout files are never removed from BlobStore
                 Key: FLINK-6525
                 URL: https://issues.apache.org/jira/browse/FLINK-6525
             Project: Flink
          Issue Type: Bug
          Components: Distributed Coordination
    Affects Versions: 1.3.0, 1.4.0
            Reporter: Till Rohrmann

The {{TaskManager}} uses the {{BlobClient}} to upload its stdout/log file to the {{BlobServer}}. If HA mode is enabled, these files are also uploaded to the {{BlobStore}}. The {{TaskManagerLogHandler}} only cleans up a file from a TM once it has received another file from the same TM, and it does so in a non-thread-safe manner. As a result, it can easily happen that files are never cleaned up from the {{BlobStore}}.

I think we should not upload these kinds of files to the persistent/HA {{BlobStore}}. We could do this by introducing a storage mode when uploading files to the {{BlobServer}} (e.g. {{HA_STORAGE}} vs. {{LOCAL_STORAGE}}). Additionally, we should register a timeout for files that are only stored locally, or at least store them under the job's {{JobID}} so that they are also removed once the job is cleaned up.
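To make the proposal concrete, here is a minimal sketch of a storage mode combined with a timeout for locally stored files. It is not Flink code: the class {{BlobStorageSketch}}, its methods, and the TTL parameter are hypothetical names chosen for illustration. Only {{HA_STORAGE}} and {{LOCAL_STORAGE}} come from the description above. Timestamps are passed in explicitly so the behavior is easy to follow.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the proposal; none of these names are Flink APIs. */
final class BlobStorageSketch {

    /** Storage modes as suggested in the issue description. */
    enum StorageMode { HA_STORAGE, LOCAL_STORAGE }

    /** Bookkeeping for one uploaded blob (assumed structure). */
    private static final class BlobMeta {
        final StorageMode mode;
        final long uploadedAtMillis;

        BlobMeta(StorageMode mode, long uploadedAtMillis) {
            this.mode = mode;
            this.uploadedAtMillis = uploadedAtMillis;
        }
    }

    // ConcurrentHashMap keeps registration and expiry safe across threads.
    private final Map<String, BlobMeta> blobs = new ConcurrentHashMap<>();
    private final long localTtlMillis;

    BlobStorageSketch(long localTtlMillis) {
        this.localTtlMillis = localTtlMillis;
    }

    /** Registers an upload; the caller picks the storage mode. */
    void put(String key, StorageMode mode, long nowMillis) {
        blobs.put(key, new BlobMeta(mode, nowMillis));
    }

    /** True only for blobs that would also go to the persistent/HA store. */
    boolean isPersisted(String key) {
        BlobMeta meta = blobs.get(key);
        return meta != null && meta.mode == StorageMode.HA_STORAGE;
    }

    boolean contains(String key) {
        return blobs.containsKey(key);
    }

    /** Drops LOCAL_STORAGE blobs whose timeout has elapsed; returns the count. */
    int expireLocal(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, BlobMeta>> it = blobs.entrySet().iterator();
        while (it.hasNext()) {
            BlobMeta meta = it.next().getValue();
            boolean timedOut = nowMillis - meta.uploadedAtMillis >= localTtlMillis;
            if (meta.mode == StorageMode.LOCAL_STORAGE && timedOut) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

In this sketch a TM log file would be registered with {{LOCAL_STORAGE}} and disappear after the timeout, while job artifacts registered with {{HA_STORAGE}} are untouched by {{expireLocal}}. Keying entries by {{JobID}} instead, as the alternative above suggests, would tie cleanup to the job lifecycle rather than to a timer.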