nikunjagarwal321 opened a new pull request, #25471: URL: https://github.com/apache/flink/pull/25471
## What is the purpose of the change

[NonDex](https://github.com/TestingResearchIllinois/NonDex) is a tool for detecting and debugging wrong assumptions on under-determined Java APIs. Running the test cases under NonDex revealed flaky tests in the following classes:

- org.apache.flink.table.planner.plan.batch.sql.PartitionableSourceTest
- org.apache.flink.table.planner.plan.rules.logical.PushPartitionIntoTableSourceScanRuleTest

The flaky tests can be reproduced with the following command: `mvn edu.illinois:nondex-maven-plugin:2.1.7:nondex -Dtest={test}`

Sample error:

```
PartitionableSourceTest.testUnconvertedExpression:136 optimized exec plan ==> expected: <
Calc(select=[id, name, part1, part2, (part2 + 1) AS virtualField])
+- TableSourceScan(table=[[default_catalog, default_database, PartitionableTable, partitions=[{part1=A, part2=2}]]], fields=[id, name, part1, part2])
> but was: <
Calc(select=[id, name, part1, part2, (part2 + 1) AS virtualField])
+- TableSourceScan(table=[[default_catalog, default_database, PartitionableTable, partitions=[{part2=2, part1=A}]]], fields=[id, name, part1, part2])
>
```

The fix is to impose an ordering in `PartitionPushDownSpec` and `PushPartitionIntoTableSourceScanRuleTest` when converting the partition Map to a String, so that the entries always appear in the same order and the tests become stable. `getDigests()` converts the partitions to a String, and because the iteration order of a Map is nondeterministic, the entries of different Map objects may be rendered in different orders. The expected strings used by the tests are hardcoded in XML files and contain only one ordering. Hence `PartitionPushDownSpec.getDigests()` should emit the Map entries in a fixed order when building the plan string.
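The idea behind the fix can be sketched as follows. This is a minimal, self-contained illustration and not the actual Flink change: the class `PartitionDigestSketch` and the methods `orderedPartition` and `digest` are hypothetical names, and the partitions are assumed to be `Map<String, String>` values. Copying each partition into a `TreeMap` is one possible way to get a fixed key order before the Map is turned into a String.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: render partition maps with a deterministic key order.
public class PartitionDigestSketch {

    // Copy the partition into a TreeMap so that toString() lists the keys
    // in natural (sorted) order, regardless of the input Map's iteration order.
    static String orderedPartition(Map<String, String> partition) {
        return new TreeMap<>(partition).toString();
    }

    // Build a digest string in the same shape as the plan fragment shown
    // in the sample error: partitions=[{...}, {...}].
    static String digest(List<Map<String, String>> partitions) {
        return "partitions=["
                + partitions.stream()
                        .map(PartitionDigestSketch::orderedPartition)
                        .collect(Collectors.joining(", "))
                + "]";
    }

    public static void main(String[] args) {
        // Insert the keys in "wrong" order on purpose.
        Map<String, String> partition = new HashMap<>();
        partition.put("part2", "2");
        partition.put("part1", "A");

        // Prints: partitions=[{part1=A, part2=2}]
        System.out.println(digest(List.of(partition)));
    }
}
```

With sorted keys, the rendered string matches the single ordering hardcoded in the expected XML plans even when NonDex shuffles the iteration order of the underlying Map.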
## Brief change log

- Order the partition Map entries in `PartitionPushDownSpec` and `PushPartitionIntoTableSourceScanRuleTest` when converting them to a String, so that the resulting plan string is deterministic

## Verifying this change

This change is already covered by existing tests, such as *org.apache.flink.table.planner.plan.batch.sql.PartitionableSourceTest*.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: don't know
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org