Arvid Heise created FLINK-19520:
-----------------------------------

             Summary: Add reliable test randomization for checkpointing
                 Key: FLINK-19520
                 URL: https://issues.apache.org/jira/browse/FLINK-19520
             Project: Flink
          Issue Type: Test
          Components: Runtime / Configuration
    Affects Versions: 1.12.0
            Reporter: Arvid Heise
            Assignee: Arvid Heise


With the larger refactoring of checkpoint alignment and the addition of more 
unaligned checkpoint settings, it becomes increasingly important to provide 
broad test coverage.

Unfortunately, adding sufficient test cases in a test matrix appears to be 
unrealistic: many of the encountered issues were subtle, sometimes caused by 
race conditions or unusual test configurations and often only visible in e2e 
tests.

Hence, we would like to rely on all existing Flink tests to provide sufficient 
coverage for checkpointing. However, as more and more options for unaligned 
checkpoints are going to be implemented in this and the upcoming release, 
running all Flink tests - especially e2e - in a test matrix is prohibitively 
expensive, even for nightly builds.

Thus, we want to introduce test randomization for all tests that do not use a 
specific checkpointing mode. This is similar to how we switched the default in 
tests from aligned to unaligned checkpoints during the last release cycle.

To not burden the developers of other components too much, we set the following 
requirements:
 * Randomization should be seeded in a way that both builds on Azure Pipelines 
and local builds will result in the same settings to ease debugging and ensure 
reproducibility.
 * Randomized options should be shown in the test log.
 * Execution order of test cases will not influence the randomization.
 * Randomization is hidden; no change to any test is needed.
 * Randomization only happens during local/remote test execution. User 
deployments are not affected.
 * Test developers are able to avoid randomization by explicitly providing 
checkpoint configs.
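
To make the requirements above more concrete, here is a minimal sketch of one 
way they could be met. It is an illustration under stated assumptions, not the 
planned implementation: the class name, the method names, and the idea of 
using a build-wide string (for example a commit id) as the seed source are all 
hypothetical, and the configuration is modeled as a plain string map rather 
than Flink's own configuration classes. The option key is used only as an 
example string. The seed is derived from the build-wide string and the test 
name alone, so the result does not depend on the machine or on the execution 
order.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper; none of these names are taken from the Flink code base. */
final class CheckpointRandomizationSketch {

    /** Example option key, treated here as an opaque string. */
    static final String UNALIGNED_KEY = "execution.checkpointing.unaligned";

    /**
     * Derives a seed from a build-wide identifier and the test name only.
     * No global state is involved, so test execution order cannot affect it.
     */
    static long seedFor(String buildSeed, String testName) {
        long h = 1125899906842597L;
        for (byte b : (buildSeed + '#' + testName).getBytes(StandardCharsets.UTF_8)) {
            h = 31 * h + b;
        }
        return h;
    }

    /**
     * Returns a copy of the explicit settings, filling in a seeded random
     * value for every option that the test did not set itself.
     */
    static Map<String, String> randomize(
            Map<String, String> explicit, String buildSeed, String testName) {
        Map<String, String> result = new LinkedHashMap<>(explicit);
        Random rnd = new Random(seedFor(buildSeed, testName));

        // Always draw, even if the value ends up unused, so that options
        // added after this one would see the same sequence of draws.
        boolean unaligned = rnd.nextBoolean();

        if (!result.containsKey(UNALIGNED_KEY)) {
            result.put(UNALIGNED_KEY, String.valueOf(unaligned));
            // Show the randomized option in the test log.
            System.out.println(
                    "Randomly selected " + UNALIGNED_KEY + "=" + unaligned
                            + " for " + testName);
        }
        return result;
    }
}
```

In this sketch, a test that explicitly sets the option keeps its value, and 
rerunning the same test with the same build-wide string always yields the same 
settings, locally or in CI.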



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
