With the positive feedback from Mridul and Wenchen, I will officially start the vote.
On Tue, Nov 15, 2022 at 8:57 PM Wenchen Fan <cloud0...@gmail.com> wrote:

> This looks great! UI stability/scalability has been a pain point for a
> long time.
>
> On Sat, Nov 12, 2022 at 5:24 AM Gengliang Wang <ltn...@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> I want to discuss the "Better Spark UI scalability and Driver stability
>> for large applications" proposal. Please find the links below:
>>
>> *JIRA* - https://issues.apache.org/jira/browse/SPARK-41053
>> *SPIP Document* -
>> https://docs.google.com/document/d/1cuKnFwlTodyVhUQPMuakq2YDaLH05jaY9FRu_aD1zMo/edit?usp=sharing
>>
>> *Excerpt from the document:*
>>
>> After SPARK-18085 <https://issues.apache.org/jira/browse/SPARK-18085>,
>> the Spark history server (SHS) becomes more scalable for processing large
>> applications by supporting a persistent KV-store (LevelDB/RocksDB) as the
>> storage layer.
>>
>> As for the live Spark UI, all the data is still stored in memory, which
>> can bring memory pressures to the Spark driver for large applications.
>>
>> For better Spark UI scalability and Driver stability, I propose to
>>
>> - Support storing all the UI data in a persistent KV store.
>>   RocksDB/LevelDB provides low memory overhead. Their write/read performance
>>   is fast enough to serve the workloads of live UI. Spark UI can retain more
>>   data with the new backend, while SHS can leverage it to fasten its
>>   startup.
>> - Support a new Protobuf serializer for all the UI data. The new
>>   serializer is supposed to be faster, according to benchmarks. It will be
>>   the default serializer for the persistent KV store of live UI.
>>
>> I appreciate any suggestions you can provide,
>> Gengliang
>>
>