Hi devs,

I have been working on the design of SPARK-28594 [1] (though I originally started on this through different requests), and the design doc is now available [2].
Let me describe SPARK-28594 briefly. A single, ever-growing event log file per application has been a major issue for streaming applications: the event log keeps growing for as long as the application runs, and many problems follow from that. The only viable workaround has been disabling the event log, which is not easily acceptable. Stopping and rerunning the application would be another approach, but it seems really odd to stop an application because of its event log.

SPARK-28594 makes it possible to roll event log files and to compact old event log files without losing the ability to replay the whole log.

While I'll break the issue down into subtasks and start with the easier ones, in parallel I'd like to ask for review of the design, to get better ideas and find possible defects in it. Please note that the doc is intended to describe the detailed changes (closer to implementation details) and is not an SPIP, because I don't think this improvement needs to go through the SPIP process: the change is not huge, and the proposal works orthogonally to the current feature. Please let me know if that's not the case and the SPIP process is necessary.

Thanks,
Jungtaek Lim (HeartSaVioR)

1. https://issues.apache.org/jira/browse/SPARK-28594
2. https://docs.google.com/document/d/12bdCC4nA58uveRxpeo8k7kGOI2NRTXmXyBOweSi4YcY/edit?usp=sharing