Myasuka commented on issue #7009: [FLINK-10712] Support to restore state when using RestartPipelinedRegionStrategy
URL: https://github.com/apache/flink/pull/7009#issuecomment-454098941

@StefanRRichter Thanks for your explanation. I still have two questions:

1. Even if we could assign partitioned operator state to all operator instances, the current `taskRestore` within `Execution` can only be shipped to TaskManagers for executions located in the failed region. The instances that we keep running would not know that operator states have changed. The possible bug you mentioned ("`This can lead to some partitions beeing assigned twice or not being assigned at all`") seems more likely to arise on the execution-graph side. Also, how do we define the 'disunity' among tasks?
2. Your last suggestion is a bit confusing to me. Please correct me if I am wrong: did you actually mean to round-robin assign operator state only if the parallelism **did change**? The parallelism can only change when the job is restarted; it does not change during a job fail-over.
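To make the second question above concrete, here is a minimal sketch of the behaviour being asked about. It is not Flink's actual code: `round_robin_assign` and `restore_assignment` are hypothetical names, and modelling state partitions as plain strings keyed by subtask index is an assumption made only for illustration. The idea is that an unchanged parallelism (as in a region fail-over) keeps the previous assignment, while a changed parallelism triggers a round-robin redistribution.

```python
from typing import Dict, List


def round_robin_assign(partitions: List[str], parallelism: int) -> Dict[int, List[str]]:
    """Distribute state partitions over subtask indices in round-robin order."""
    if parallelism <= 0:
        raise ValueError("parallelism must be positive")
    assignment: Dict[int, List[str]] = {i: [] for i in range(parallelism)}
    for position, partition in enumerate(partitions):
        assignment[position % parallelism].append(partition)
    return assignment


def restore_assignment(previous: Dict[int, List[str]], new_parallelism: int) -> Dict[int, List[str]]:
    """Hypothetical restore rule: reuse the old assignment if parallelism is unchanged."""
    if len(previous) == new_parallelism:
        # Fail-over case: every subtask gets back exactly what it had before.
        return {subtask: list(parts) for subtask, parts in previous.items()}
    # Rescaling case: collect all partitions in subtask order and redistribute.
    all_partitions = [p for subtask in sorted(previous) for p in previous[subtask]]
    return round_robin_assign(all_partitions, new_parallelism)
```

Under this rule, tasks that keep running during a region fail-over would not need to be told about a new assignment, because nothing moves unless the parallelism changes.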
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services