[ https://issues.apache.org/jira/browse/FLINK-38344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei Zhong reassigned FLINK-38344:
---------------------------------

    Assignee: RocMarshal

> The local files of the HistoryServer may risk never being deleted.
> ------------------------------------------------------------------
>
>                 Key: FLINK-38344
>                 URL: https://issues.apache.org/jira/browse/FLINK-38344
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Web Frontend
>    Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.1.1
>            Reporter: RocMarshal
>            Assignee: RocMarshal
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 2.2.0, 2.1.1, 1.20.4
>
>         Attachments: image-2025-09-11-00-31-26-595.png,
> image-2025-09-11-00-32-25-793.png, image-2025-09-11-00-34-54-580.png
>
>
> When the {{historyserver.web.tmpdir}} configuration points to a non-system
> temporary directory, the contents of this directory will only be cleaned up
> if explicitly deleted.
> Under the current cleanup logic, this directory is cleared in the following
> two scenarios:
>
> 1. *When the HistoryServer encounters an exception*, it actively cleans
> up this directory. However, if the HistoryServer process is forcibly
> terminated externally, this cleanup logic will not be triggered.
> !image-2025-09-11-00-31-26-595.png!
>
> !image-2025-09-11-00-32-25-793.png!
>
> 2. *The {{HistoryServerArchiveFetcher}}* builds
> {{cachedArchivesPerRefreshDirectory}} based on the job information still
> present in the remote directory and uses this to determine which local job
> files need cleanup. Consequently, if the HistoryServer retains a large number
> of local job files that no longer exist in remote storage, these files will
> never be deleted. This may lead to excessive file handle usage on the local
> node, resulting in file descriptor leaks.
> !image-2025-09-11-00-34-54-580.png!
>
> A relatively straightforward fix would be:
> In the HistoryServer constructor, first clear all files in the
> {{historyserver.web.tmpdir}} directory before proceeding with the original
> initialization logic. This ensures that the local files marked for
> cleanup (based on
> {{HistoryServerArchiveFetcher#cachedArchivesPerRefreshDirectory}}) are
> free from leaks.
> I'd like to fix it.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
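
For illustration only, the fix proposed in the issue (emptying the {{historyserver.web.tmpdir}} directory before the rest of the HistoryServer initialization runs) could look roughly like the sketch below. This is not the code from the Flink pull request: the class name TmpDirCleaner and the method clearDirectoryContents are hypothetical, and the sketch assumes the directory is a local path that the HistoryServer alone owns.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; not part of Flink. */
public class TmpDirCleaner {

    /**
     * Deletes every file and subdirectory under {@code dir} while keeping
     * {@code dir} itself. Creates {@code dir} if it does not exist yet.
     */
    public static void clearDirectoryContents(Path dir) throws IOException {
        Files.createDirectories(dir);

        List<Path> entries;
        try (Stream<Path> walk = Files.walk(dir)) {
            entries = walk
                    .filter(p -> !p.equals(dir))
                    // Reverse order lists children before their parents,
                    // so directories are already empty when deleted.
                    .sorted(Comparator.reverseOrder())
                    .collect(Collectors.toList());
        }

        for (Path entry : entries) {
            Files.deleteIfExists(entry);
        }
    }
}
```

Under these assumptions, a constructor would call TmpDirCleaner.clearDirectoryContents(webDir) once, before any archive fetching starts, so that files left behind by a forcibly terminated process cannot outlive the next startup.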