On 2021/01/06 13:35, Arvid Heise wrote:
I was actually not thinking about concurrent checkpoints (and actually want to get rid of them once UC is established, since they are addressing the same thing).
I would give a yuge +1 to that. I don't see why we would need concurrent checkpoints in most cases. (Any case even?)
However, I have the impression that you think mostly in terms of tasks and I mostly think in terms of subtasks. I especially want to have proper support for bounded sources where one partition is much larger than the other partitions (might be in conjunction with unbounded sources such that checkpointing is plausible to begin with). Hence, most of the subtasks are finished with one straggler remaining. In this case, the barriers are now inserted only in the straggling source subtask and potentially in any running downstream subtask. As far as I have understood, this would require barriers to be inserted downstream, leading to similar race conditions.
No, I'm also thinking in terms of subtasks when it comes to triggering. As long as a subtask has at least one running upstream task, we don't need to manually trigger that subtask. A task knows which of its inputs have finished, so it excludes those from the set of upstream tasks whose barriers it waits for. In the case where only a single upstream source remains, the barriers from that source will then trigger checkpointing at the downstream task.
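To make the alignment rule concrete, here is a rough sketch of how a subtask could drop finished inputs from the set it waits on. This is a toy model, not Flink code: the class `BarrierAligner`, its methods, and the input names are all made up for illustration, and it ignores buffering, timeouts, and unaligned checkpoints.

```python
class BarrierAligner:
    """Toy model of one subtask waiting for barriers from its running inputs.

    Hypothetical names throughout; this is not a Flink API.
    """

    def __init__(self, input_ids):
        self.running = set(input_ids)  # inputs that have not finished yet
        self.seen = {}                 # checkpoint id -> inputs whose barrier arrived

    def _take_if_aligned(self, checkpoint_id):
        # Aligned once every still-running input has delivered its barrier.
        # With no running input left, nothing upstream can trigger this subtask.
        got = self.seen.get(checkpoint_id, set())
        if self.running and self.running <= got:
            del self.seen[checkpoint_id]  # forget it so it triggers only once
            return True
        return False

    def on_barrier(self, input_id, checkpoint_id):
        """Record a barrier; return True if the checkpoint should trigger now."""
        self.seen.setdefault(checkpoint_id, set()).add(input_id)
        return self._take_if_aligned(checkpoint_id)

    def on_input_finished(self, input_id):
        """Stop waiting on a finished input; return checkpoints this unblocks."""
        self.running.discard(input_id)
        return [c for c in sorted(self.seen) if self._take_if_aligned(c)]


# Two of three sources finish early; the straggler alone then drives checkpoints.
aligner = BarrierAligner(["src-0", "src-1", "src-2"])
aligner.on_input_finished("src-0")
aligner.on_input_finished("src-1")
triggered = aligner.on_barrier("src-2", checkpoint_id=7)
```

In this model the downstream subtask never needs a manual trigger while at least one of its inputs is still running, which is the point made above.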
I'm also concerned about the notion of a final checkpoint. What happens when this final checkpoint times out (checkpoint timeout > async timeout) or fails for a different reason? I'm currently more inclined to just let checkpoints work until the whole graph is completed (and thought this was the initial goal of the whole FLIP to begin with). However, that would require subtasks to stay alive until they receive the checkpointCompleted callback (which is currently also not guaranteed)...
The idea is that the final checkpoint is whatever checkpoint succeeds in the end. When a task (and I mostly mean subtask when I say task) knows that it is done, it waits for the next successful checkpoint and then shuts down.
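As a rough sketch of that lifecycle: a finished subtask simply keeps waiting across failed or timed-out checkpoints and shuts down on the first one that succeeds. The names below (`Subtask`, `State`, the `on_*` methods) are hypothetical and only illustrate the idea; the simplifying assumption is that any checkpoint completing after the subtask is done counts as its final checkpoint.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    WAITING_FOR_CHECKPOINT = "waiting for checkpoint"
    SHUT_DOWN = "shut down"


class Subtask:
    """Toy lifecycle of a subtask that outlives its input until a checkpoint succeeds."""

    def __init__(self, name):
        self.name = name
        self.state = State.RUNNING
        self.final_checkpoint = None  # id of the checkpoint that allowed shutdown

    def on_input_exhausted(self):
        # The subtask knows it is done, but it must not release resources yet.
        if self.state is State.RUNNING:
            self.state = State.WAITING_FOR_CHECKPOINT

    def on_checkpoint_failed(self, checkpoint_id):
        # A failed or timed-out checkpoint changes nothing: keep waiting for the next one.
        pass

    def on_checkpoint_completed(self, checkpoint_id):
        # The first checkpoint that succeeds after the subtask is done becomes "final".
        if self.state is State.WAITING_FOR_CHECKPOINT:
            self.final_checkpoint = checkpoint_id
            self.state = State.SHUT_DOWN


task = Subtask("source-3")
task.on_input_exhausted()
task.on_checkpoint_failed(11)     # e.g. a timeout: the subtask stays alive
task.on_checkpoint_completed(12)  # the subtask can now shut down and free resources
```

There is no specially designated final checkpoint in this model, so a timeout only delays shutdown until a later checkpoint succeeds.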
This is a basic question, though: should we simply keep all tasks (subtasks) around until the whole graph shuts down? So far, our answer to this has been *no*. We would like to allow tasks to shut down, so that their resources are freed at that point.
Best, Aljoscha