Hi, Allison. Thanks for driving this FLIP.
I have some questions to confirm:

1. I can’t find any existing configuration named
`historyserver.archive.cached-retained-jobs`. I guess what you mean is
renaming the existing configuration `historyserver.archive.retained-jobs`
to `historyserver.archive.cached-retained-jobs`. If so, and we only limit
the number of retained jobs stored locally, is the number of retained jobs
stored remotely unlimited?
2. I think it would be better to provide instructions for adding default
values to HistoryServerOptions.
3. Does `historyserver.archive.fs.refresh-interval` apply to both local and
remote storage simultaneously?

Best,
Yanquan

Allison <achang5...@gmail.com> wrote on Fri, Jan 17, 2025 at 8:07 AM:

> Hi everyone,
>
> I would like to initiate a discussion for the FLIP below, which enhances
> the Flink History Server to allow greater scalability of the service.
>
> Motivation:
>
> Currently, the Flink History Server (FHS) is limited in the number of job
> archives it can serve based on the storage capacity of the node that the
> FHS runs on. Job archives are stored locally in a cache, which creates a
> local directory that is expanded from the contents of a single
> JSON archive file. This not only uses up local memory space but, because
> of how the FHS expands job archives into a nested directory structure,
> also often exhausts inode space for jobs with a large number of
> taskmanagers or subtasks. To make the FHS more performant, we would
> like to decouple job archive storage from the local cache, so that the
> FHS can store and fetch job archives from a remote file store.
>
> FLIP proposal document:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+505%3A+Flink+History+Server+Scability+Improvements%2C+Remote+Data+Store+Fetch+and+Per+Job+Fetch
>
> Thanks!
>
> Best,
> - Allison Chang
>
