Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1083#issuecomment-138861709

I don't think this really works. It violates so many assumptions, like the fact that memory is available after a job ends. The accounting for that depends on task slots - a free slot must have the necessary memory.

When you call collect twice, the whole program gets re-executed and some parts are simply discarded for the sake of using the persisted data. That makes sure the results are the same, but does not save any work. For that, you need backtracking through the graph to check whether results are already available.

Which brings us to the point: making the system aware of this at the level of data streams (on both the TaskManager and the JobManager side) solves two issues with one approach: caching for reuse in incrementally constructed programs, and caching for recovery. Both cases simply need persistent streams and backtracking in the execution graph.

Sorry that you made this effort and it cannot be merged now, but I do not understand why you keep ignoring the discussions, inputs, and the request to plan and describe such highly involved features before writing them.