Re: [DISCUSS] FLIP-XXX: Support Pluggable Storage Backend for HistoryServer

zihao chen Tue, 09 Jun 2026 23:06:36 -0700

Bumping this thread. Thanks!

Best regards,
Zihao


zihao chen <[email protected]> wrote on Sat, May 23, 2026 at 18:39:

> Hi all,
>
> Thanks everyone for the valuable feedback and discussions on this FLIP.
>
> Based on the discussion so far, the proposal has received generally
> positive feedback, and several important points have been clarified,
> including:
>
>    - ArchiveStorage API design considerations
>    - RocksDB deployment model and isolation between HistoryServer
>    instances
>    - Cleanup and retention strategy compatibility with existing mechanisms
>
>
> For reference, the earlier related discussion can be found here:
> https://lists.apache.org/thread/6thlq9c5twyvzmcw7q24nm4q0rcbz5qp
>
> If there are no further major concerns, I’m planning to start the VOTE
> thread next Tuesday.
>
> Please feel free to share any additional feedback before then.
>
> Best regards,
> Zihao
>
> zihao chen <[email protected]> wrote on Tue, May 19, 2026 at 21:05:
>
>> Hi Zuo,
>>
>> Thanks for your feedback and for aligning on this direction.
>>
>> Here are the clarifications regarding your questions:
>>
>>    - *RocksDB Deployment*:
>>
>> Each RocksDB instance is coupled with a single HistoryServer instance
>> (each HistoryServer has its own independent local RocksDB). There is no
>> shared access between multiple HistoryServer instances.
>>
>>
>>    - *Cleanup Strategy*:
>>
>> The core cleanup still relies on the original ArchiveRetainedStrategy
>> (maximum job count, TTL, etc.). We have also implemented a
>> disk-capacity-based cleanup strategy internally to prevent disk
>> exhaustion, but that feature is relatively independent. I would like to
>> decouple it for now and discuss it further in a follow-up FLIP.
>>
>>
>> Let me know if this looks good to you!
>>
>>
>> Best regards,
>>
>> Zihao
>>
>>
>> 魏祚 <[email protected]> wrote on Tue, May 19, 2026 at 17:33:
>>
>>>
>>>
>>> Hi Zihao,
>>>
>>>
>>> Thanks for your proposal. The excessive small-files problem in
>>> HistoryServer is indeed a real pain point in large-scale production
>>> environments, and introducing RocksDB is a great idea.
>>>
>>> There are a few details I'd like to clarify:
>>>
>>>    - What is the deployment strategy for RocksDB? Is there a scenario
>>>    where multiple HistoryServer instances share and access the same
>>>    RocksDB instance? If so, are there any potential compatibility or
>>>    concurrency risks?
>>>    - After introducing RocksDB, what is the strategy for cleaning up
>>>    historical garbage files and expired job archives?
>>>
>>>
>>> Best regards,
>>> Zuo Wei
>>>
>>>
>>> ----- Original Message -----
>>> From: "zihao chen" <[email protected]>
>>> To: [email protected]
>>> Sent: Sat, 9 May 2026 11:37:08 +0800
>>> Subject: [DISCUSS] FLIP-XXX: Support Pluggable Storage Backend for
>>> HistoryServer
>>>
>>> Hi all,
>>>
>>> I’d like to start a discussion on FLIP-XXX:
>>>
>>> *Support Pluggable Storage Backend for HistoryServer*.
>>>
>>> This FLIP proposes improving the HistoryServer
>>> to address excessive *small files* when handling
>>> large numbers of archived jobs.
>>>
>>> [Proposal]
>>> Optional *RocksDB-based storage* to reduce
>>> small files
>>>
>>> [Compatibility]
>>> Full backward compatibility (FILE as default)
>>>
>>> The detailed design is described in the
>>> FLIP document:
>>>
>>>
>>> https://docs.google.com/document/d/1idHu5bq0GOsUuUAEIJSJ2UuekcDjbW0tHLNbsQfugDg/edit?usp=sharing
>>>
>>> This FLIP is split from the earlier discussion [1].
>>>
>>> Looking forward to your feedback.
>>>
>>> [1] https://lists.apache.org/thread/6thlq9c5twyvzmcw7q24nm4q0rcbz5qp
>>>
>>>
>>> Best regards,
>>>
>>> Zihao Chen
>>>
>>
