[ 
https://issues.apache.org/jira/browse/FLINK-38344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhong reassigned FLINK-38344:
---------------------------------

    Assignee: RocMarshal

> The local files of the HistoryServer may risk never being deleted.
> ------------------------------------------------------------------
>
>                 Key: FLINK-38344
>                 URL: https://issues.apache.org/jira/browse/FLINK-38344
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Web Frontend
>    Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.1.1
>            Reporter: RocMarshal
>            Assignee: RocMarshal
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 2.2.0, 2.1.1, 1.20.4
>
>         Attachments: image-2025-09-11-00-31-26-595.png, 
> image-2025-09-11-00-32-25-793.png, image-2025-09-11-00-34-54-580.png
>
>
> When the {{historyserver.web.tmpdir}} configuration points to a non-system 
> temporary directory, the contents of this directory are only cleaned up 
> if they are explicitly deleted.
> Under the current cleanup logic, this directory is cleared in the following 
> two scenarios:
> 1. *When the HistoryServer encounters an exception*, it actively cleans 
> up this directory. However, if the HistoryServer process is forcibly 
> terminated externally, this cleanup logic will not be triggered.
> !image-2025-09-11-00-31-26-595.png!
>  
> !image-2025-09-11-00-32-25-793.png!
>  
> 2. *The {{HistoryServerArchiveFetcher}}* builds 
> {{cachedArchivesPerRefreshDirectory}} based on the job information still 
> present in the remote directory and uses this to determine which local job 
> files need cleanup. Consequently, if the HistoryServer retains a large number 
> of local job files that no longer exist in remote storage, these files will 
> never be deleted. This may lead to excessive file handle usage on the local 
> node, resulting in file descriptor leaks.
> !image-2025-09-11-00-34-54-580.png!
>
> A relatively straightforward fix would be:
> In the HistoryServer constructor, first clear all files in the 
> {{historyserver.web.tmpdir}} directory before proceeding with the original 
> initialization logic. After that, every local file is one that the 
> {{HistoryServerArchiveFetcher}} has fetched since startup, so the cleanup 
> driven by {{HistoryServerArchiveFetcher#cachedArchivesPerRefreshDirectory}} 
> covers all of them and no stale files can leak.
> I'd like to fix it.
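
The proposed fix could be sketched roughly as follows. This is only an illustration of "clear the tmpdir before the original initialization logic", not the actual patch: the class name TmpDirCleanupSketch and the method clearDirectory are hypothetical and are not part of Flink's HistoryServer API. The sketch assumes the directory configured via historyserver.web.tmpdir is owned exclusively by the HistoryServer, so that removing everything below it at startup is safe.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not part of Flink. */
public class TmpDirCleanupSketch {

    /**
     * Deletes everything below tmpDir but keeps the directory itself.
     * Returns the number of deleted entries (files and directories).
     */
    public static int clearDirectory(Path tmpDir) throws IOException {
        if (!Files.isDirectory(tmpDir)) {
            return 0; // nothing to clean, e.g. on the very first start
        }
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(tmpDir)) {
            // Reverse order so that children are deleted before their parents.
            entries = walk.filter(p -> !p.equals(tmpDir))
                    .sorted(Comparator.reverseOrder())
                    .collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
        return entries.size();
    }

    public static void main(String[] args) throws IOException {
        // Simulate leftovers from a previous, forcibly terminated process.
        Path tmpDir = Files.createTempDirectory("historyserver-tmp");
        Path staleJob = Files.createDirectories(tmpDir.resolve("jobs").resolve("stale-job"));
        Files.writeString(staleJob.resolve("config.json"), "{}");

        // In the real fix this call would run in the constructor,
        // before the original initialization logic.
        int deleted = clearDirectory(tmpDir);
        System.out.println("Removed " + deleted + " stale entries from " + tmpDir);
    }
}
```

Clearing at startup covers both scenarios described in the issue: leftovers from a forcibly terminated process, and local job files whose remote archives no longer exist. The trade-off is that archives still present in remote storage have to be fetched again after each restart.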



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
