zhijiangW commented on issue #11687: [FLINK-16536][network][checkpointing] 
Implement InputChannel state recovery for unaligned checkpoint
URL: https://github.com/apache/flink/pull/11687#issuecomment-612792024
 
 
   Considering another possible option to implement the input recovery, I 
thought through a bit for the proposal of introducing a parallel 
`StreamTaskSpilledInput` /`SpilledInputGate`/`SpilledInputChannel `. If we want 
to further support the new unaligned checkpoint progress during recovery, that 
means two separate components should run in parallel future. Then it seems 
harder to control and bring more component changes if we work on more 
high-level components.
   
   Regarding the lowest-level component `SpilledInputChannel`, I also think it 
should be more complex if running in parallel with respective 
`RemoteInputChannel` future for below concerns.
   
   - `SingleInputGate` would maintain two suits of input channel collections. 
The normal remote/local channel collection can only receive events like 
`CheckpointBarrier`, `EndOfPartition`  during recovery, and the other 
`SpilledInputChannel` collection can read state. We might need to adjust the 
queue `inputChannelsWithData` in gate to control the priority of data from 
different channel collection.
   
   - How to handle the exclusive/floating buffer distribution inside 
`SpilledInputChannel`? There might be two options. If we want to recovery only 
one channel state at a time by one thread, then we can give all the available 
buffers(regardless of exclusive/floating) to this channel. And after finishing 
this channel, we can move forward to the next channel with all available 
buffers. If we want o recovery multiple channels in parallel with different 
threads(as this PR did now), we might probably back to the existing mechanisms 
as in `RemoteInputChannel` , then we need to consider the code duplication 
issue.
   
   - We still need to adjust the internal logic inside `RemoteInputChannel` to 
do so. E.g. when it receives the barrier from network, it should not request 
floating buffers based on backlog, along with the related logics with exclusive 
buffer and floating buffers, until all the `SpilledInputChannel` finish. 
   
   - When all the `SpilledInputChannel` finish, it should notify the 
`RemoteInputChannel` restore the normal exclusive/floating buffers logic. The 
interaction is needed among different channels via `SingleInputGate`.
   
   In conclusion, the essential difference with current PR implementation is 
whether to extract the logic `readChannelState` out of `RemoteInputChannel` to 
relief its load. In contrast, this logic is introduced into new 
`SpilledInputChannel` which would bring extra complexity in `SingleInputGate` 
component and the trouble for reusing the existing buffer distribution/request 
logic.
   
   So I a bit prefer the current way in PR at-least for MVP now until we fully 
consider well for everything future.
   But we can make some partial changes to control the channel state process on 
`SingleInputGate` component. E.g. introduce a separate thread pool to execute 
the channel read instead of reusing netty threads as above discussed, and 
control the channel read one by one instead of concurrently if have the 
requirement. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to