azagrebin opened a new pull request #14301: URL: https://github.com/apache/flink/pull/14301
Currently, there is no easy way to test how JobManager (JM) failover (revoking and granting leadership) affects other features when using the `MiniCluster` and its testing resource rule. Custom HA services can be provided to the `TestingMiniCluster`, but there are no simple HA services that support revoking and granting leadership with a valid in-memory checkpoint store. Enabling such embedded HA services for the `MiniCluster` out of the box makes it possible to implement IT cases similar to E2E tests.

This PR modifies `TestingCheckpointRecoveryFactory` to create and keep a checkpoint store and counter per job, so that a single factory can be shared across multiple jobs in a cluster. It also uses the in-memory `RecoverableCompletedCheckpointStore` (renamed to `EmbeddedCompletedCheckpointStore`) in `TestingEmbeddedHaServices` (renamed to `EmbeddedHaServicesWithLeadershipControl`). As a result, `EmbeddedHaServicesWithLeadershipControl` can be used to revoke and grant leadership with a valid checkpoint store.
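
The per-job bookkeeping described in this PR can be illustrated with a rough, self-contained sketch. It is not the Flink implementation: `PerJobCheckpointFactorySketch`, `Factory`, `InMemoryCheckpointStore`, `storeFor`, and `counterFor` are hypothetical names chosen for illustration, and a plain `String` stands in for the job ID. The assumption is only that the factory creates a store and a counter the first time a job asks for them and returns the same instances on later requests, so that checkpoints survive a simulated revoke and grant of leadership.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these types are Flink classes.
class PerJobCheckpointFactorySketch {

    /** In-memory stand-in for a completed checkpoint store. */
    static final class InMemoryCheckpointStore {
        private final List<Long> completedCheckpointIds = new ArrayList<>();

        synchronized void add(long checkpointId) {
            completedCheckpointIds.add(checkpointId);
        }

        /** Returns a copy so callers cannot mutate the store's state. */
        synchronized List<Long> getAll() {
            return new ArrayList<>(completedCheckpointIds);
        }
    }

    /** Creates state once per job and hands back the same instances afterwards. */
    static final class Factory {
        private final Map<String, InMemoryCheckpointStore> stores = new ConcurrentHashMap<>();
        private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

        InMemoryCheckpointStore storeFor(String jobId) {
            return stores.computeIfAbsent(jobId, id -> new InMemoryCheckpointStore());
        }

        AtomicLong counterFor(String jobId) {
            // Checkpoint IDs start at 1 in this sketch (an arbitrary choice).
            return counters.computeIfAbsent(jobId, id -> new AtomicLong(1));
        }
    }

    public static void main(String[] args) {
        Factory factory = new Factory();

        // First leadership term of "job-a": complete two checkpoints.
        InMemoryCheckpointStore store = factory.storeFor("job-a");
        AtomicLong counter = factory.counterFor("job-a");
        store.add(counter.getAndIncrement());
        store.add(counter.getAndIncrement());

        // Simulated revoke/grant: the new leader asks the shared factory again
        // and gets the store that already holds the earlier checkpoints.
        InMemoryCheckpointStore recovered = factory.storeFor("job-a");
        System.out.println("job-a after failover: " + recovered.getAll());

        // A second job sharing the factory gets its own, empty store.
        System.out.println("job-b: " + factory.storeFor("job-b").getAll());
    }
}
```

Keying both maps by job ID is what would let one factory instance serve several jobs in the same cluster without their checkpoints or counters interfering.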