[ https://issues.apache.org/jira/browse/FLINK-29789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-29789:
-----------------------------------
    Labels: pull-request-available stale-minor  (was: pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Minor but is unassigned and neither itself nor its Sub-Tasks have been updated for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is still Minor, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized.

> Fix flaky tests in CheckpointCoordinatorTest
> --------------------------------------------
>
>                 Key: FLINK-29789
>                 URL: https://issues.apache.org/jira/browse/FLINK-29789
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Sopan Phaltankar
>            Priority: Minor
>              Labels: pull-request-available, stale-minor
>
> The test
> org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex
> is flaky and has the following failure:
> Failures:
> [ERROR] Failures:
> [ERROR]   CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex:1054 expected:<2> but was:<1>
> I used the tool [NonDex|https://github.com/TestingResearchIllinois/NonDex] to
> find this flaky test.
> Command: mvn -pl flink-runtime edu.illinois:nondex-maven-plugin:1.1.2:nondex
> -Dtest=org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest#testTriggerAndDeclineCheckpointComplex
> I analyzed the assertion failure and found that checkpoint1Id and
> checkpoint2Id are assigned by iterating over a HashMap.
> HashMap makes no guarantee about its iteration order
> ([JavaDoc|https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--]),
> so the test can fail for some orders.
> Therefore, to remove this non-determinism, we would change HashMap to
> LinkedHashMap.
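As a minimal sketch of the ordering difference described above (this is not the actual CheckpointCoordinator code; the class name, helper name, and map contents are hypothetical and only illustrate the idea), a LinkedHashMap yields keys in insertion order, while a HashMap specifies no order at all:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names do not come from the Flink code base.
class IterationOrderSketch {

    // Collects the keys in whatever order the map's iterator yields them,
    // similar to how a test might read two IDs off a map by iterating it.
    static List<Long> keysInIterationOrder(Map<Long, String> pending) {
        return new ArrayList<>(pending.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap: iteration order is the insertion order.
        Map<Long, String> linked = new LinkedHashMap<>();
        linked.put(42L, "first");
        linked.put(7L, "second");
        System.out.println(keysInIterationOrder(linked)); // prints [42, 7]

        // HashMap: the order is unspecified, so code that assumes a
        // particular order may break under a tool such as NonDex.
        Map<Long, String> hashed = new HashMap<>();
        hashed.put(42L, "first");
        hashed.put(7L, "second");
        System.out.println(keysInIterationOrder(hashed));
    }
}
```

A test that assigns IDs by position in the iteration can only rely on that position when the map type defines its order, which is the motivation for the proposed switch.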
> On further analysis, it was found that the Map is initialized on line
> 1894 of the org.apache.flink.runtime.checkpoint.CheckpointCoordinator class.
> After changing from HashMap to LinkedHashMap, the above test passes
> without any non-determinism.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)