Sorry for the late response. So many FLIPs these days.

I am a bit unsure about the motivation here, and whether this needs to be a part of Flink. It sounds like this can be perfectly built around Flink as a minimal library on top of it, without any change in the core APIs or runtime.
The proposal to handle "caching intermediate results" (to make them reusable across jobs in a session) and "writing them in different formats / indexing them" doesn't sound like it should be the same mechanism.

- The caching part is a transparent low-level primitive. It avoids re-executing a part of the job graph, but is otherwise completely transparent to the consumer job.

- Writing data out in a sink, compressing/indexing it, and then reading it in another job is also a way of reusing a previous result, but on a completely different abstraction level. It is not the same intermediate result any more. When the consumer reads from it and applies predicate pushdown, etc., the consumer job looks completely different from a job that consumed the original result. It hence needs to be solved on the API level via a sink and a source.

I would suggest keeping these concepts separate: caching (possibly automatically) for jobs in a session, and long term writing/sharing of data sets.

Solving the "long term writing/sharing" in a library rather than in the runtime also has the advantage of not pushing yet more stuff into Flink's core, which I believe is also an important criterion.

Best,
Stephan

On Thu, Jul 25, 2019 at 4:53 AM Xuannan Su <suxuanna...@gmail.com> wrote:

> Hi folks,
>
> I would like to start the FLIP discussion thread about the pluggable
> intermediate result storage.
>
> This is phase 2 of FLIP-36: Support Interactive Programming in Flink.
> While the FLIP-36 provides a default implementation of
> the intermediate result storage using the shuffle service, we would like to
> make the intermediate result storage pluggable so that the user can easily
> swap the storage.
>
> We are looking forward to your thoughts!
>
> The FLIP link is the following:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-48%3A+Pluggable+Intermediate+Result+Storage
>
> Best,
> Xuannan