zhijiangW commented on issue #11687: [FLINK-16536][network][checkpointing] Implement InputChannel state recovery for unaligned checkpoint URL: https://github.com/apache/flink/pull/11687#issuecomment-612792024 Considering another possible option to implement the input recovery, I thought through a bit for the proposal of introducing a parallel `StreamTaskSpilledInput` /`SpilledInputGate`/`SpilledInputChannel `. If we want to further support the new unaligned checkpoint progress during recovery, that means two separate components should run in parallel future. Then it seems harder to control and bring more component changes if we work on more high-level components. Regarding the lowest-level component `SpilledInputChannel`, I also think it should be more complex if running in parallel with respective `RemoteInputChannel` future for below concerns. - `SingleInputGate` would maintain two suits of input channel collections. The normal remote/local channel collection can only receive events like `CheckpointBarrier`, `EndOfPartition` during recovery, and the other `SpilledInputChannel` collection can read state. We might need to adjust the queue `inputChannelsWithData` in gate to control the priority of data from different channel collection. - How to handle the exclusive/floating buffer distribution inside `SpilledInputChannel`? There might be two options. If we want to recovery only one channel state at a time by one thread, then we can give all the available buffers(regardless of exclusive/floating) to this channel. And after finishing this channel, we can move forward to the next channel with all available buffers. If we want o recovery multiple channels in parallel with different threads(as this PR did now), we might probably back to the existing mechanisms as in `RemoteInputChannel` , then we need to consider the code duplication issue. - We still need to adjust the internal logic inside `RemoteInputChannel` to do so. E.g. when it receives the barrier from network, it should not request floating buffers based on backlog, along with the related logics with exclusive buffer and floating buffers, until all the `SpilledInputChannel` finish. - When all the `SpilledInputChannel` finish, it should notify the `RemoteInputChannel` restore the normal exclusive/floating buffers logic. The interaction is needed among different channels via `SingleInputGate`. In conclusion, the essential difference with current PR implementation is whether to extract the logic `readChannelState` out of `RemoteInputChannel` to relief its load. In contrast, this logic is introduced into new `SpilledInputChannel` which would bring extra complexity in `SingleInputGate` component and the trouble for reusing the existing buffer distribution/request logic. So I a bit prefer the current way in PR at-least for MVP now until we fully consider well for everything future. But we can make some partial changes to control the channel state process on `SingleInputGate` component. E.g. introduce a separate thread pool to execute the channel read instead of reusing netty threads as above discussed, and control the channel read one by one instead of concurrently if have the requirement.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services