GitHub user ahmedahamid opened a pull request: https://github.com/apache/samza/pull/637
SAMZA-1860: Modularize Join input validation in ExecutionPlanner This change breaks down the validation of partition counts of input and intermediate streams participating in Join operations into 3 separate steps: 1. Grouping `InputOperatorSpec`s by the `JoinOperatorSpec`s of the Join operations they participate in 2. Replacing `InputOperatorSpec`s with their corresponding `StreamEdge`s 3. Verifying/Inferring partition counts of input/intermediate streams You can merge this pull request into a Git repository by running: $ git pull https://github.com/ahmedahamid/samza dev/ahabdulh/modularize-exec-planner Alternatively you can review and apply these changes as the patch at: https://github.com/apache/samza/pull/637.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #637 ---- commit 88fdad8c004243b376b1980999ed82581d3be796 Author: Ahmed Abdul Hamid <ahabdulh@...> Date: 2018-09-05T02:02:11Z SAMZA-1838: Make some minor improvements to ExecutionPlanner This commit includes the following changes: - Fix case where ExecutionPlanner did not throw in response to joining 2 input streams with different partition counts - Improve some method names in ExecutionPlanner - Improve some method/field names in JobGraph - Make minor improvements to createJobGraph() - Rewrite updateExistingPartitions() to make it a little easier to follow - Use more constrained OperatorSpec types in the associations defined in calculateJoinInputPartitions() - Have calculateIntStreamPartitions() throw in response to bad config for job.intermediate.stream.partitions - Improve some error messages commit c8763d2ed60e376cd81734c6461817fa7c9fd3f9 Author: Ahmed Abdul Hamid <ahabdulh@...> Date: 2018-09-12T04:42:13Z SAMZA-1860: Modularize Join input validation in ExecutionPlanner This change breaks down the validation of partition counts of input and intermediate streams participating in Join operations into 3 separate steps: 1. Grouping InputOperatorSpecs by the JoinOperatorSpecs of the Join operations they participate in 2. Replacing InputOperatorSpecs with their corresponding StreamEdges 3. Verifying/Inferring partition counts of input/intermediate streams ---- ---