Hi everyone,

Recently there was a bug [1] caused by discrepancies between two of
Dataflow's reshuffle implementations. I think the reference implementation
in the Java SDK [2] also does not match. This all led to discussion on the
bug and the pull request [3] about what the actual semantics should be. I
got it wrong, maybe multiple times. So I wrote up a very short document to
finish the discussion:

    https://s.apache.org/beam-reshuffle

This is also probably among the simplest imaginable uses of
http://s.apache.org/ptransform-design-doc, in case you want to see roughly
how I intended it to be used.

Kenn

[1] https://github.com/apache/beam/issues/28219
[2]
https://github.com/apache/beam/blob/d52b077ad505c8b50f10ec6a4eb83d385cdaf96a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java#L84
[3] https://github.com/apache/beam/pull/28272
