Hi everyone,

Recently there was a bug [1] caused by discrepancies between two of Dataflow's reshuffle implementations. I think the reference implementation in the Java SDK [2] also does not match. This all led to discussion on the bug and the pull request [3] about what the actual semantics should be. I got it wrong, maybe multiple times. So I wrote up a very short document to finish the discussion:
https://s.apache.org/beam-reshuffle

This is also probably among the simplest imaginable uses of http://s.apache.org/ptransform-design-doc, in case you want to see roughly how I intended it to be used.

Kenn

[1] https://github.com/apache/beam/issues/28219
[2] https://github.com/apache/beam/blob/d52b077ad505c8b50f10ec6a4eb83d385cdaf96a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java#L84
[3] https://github.com/apache/beam/pull/28272