So a high level suggestion from Robert that I want to highlight as a top-post:
Instead of focusing on just fixing the SDKs and runners Reshuffle, this could be an opportunity to introduce Redistribute which was proposed in the long-ago thread. The semantics are identical but it is more clear that it *only* is a hint about redistributing data and there is no expectation of a checkpoint. This new name may also be an opportunity to maintain update compatibility (though this may actually be leaving unsafe code in user's hands) and/or separate @RequiresStableInput/checkpointing uses of Reshuffle from redistribution-only uses of Reshuffle. Any other thoughts on this one high level bit? Kenn On Thu, Oct 5, 2023 at 11:15 AM Kenneth Knowles <k...@apache.org> wrote: > > On Wed, Oct 4, 2023 at 7:45 PM Robert Burke <lostl...@apache.org> wrote: > >> LGTM. >> >> It looks the Go SDK already adheres to these semantics as well for the >> reference impl(well, reshuffle/redistribute_randomly, _by_key isn't >> implemented in the Go SDK, and only uses the existing unqualified reshuffle >> URN [0]. >> >> The original strategy, and then for every element, the original Window, >> TS, and Pane are all serialized, shuffled, and then deserialized downstream. >> >> >> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/reshuffle.go#L65 >> >> >> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/reshuffle.go#L145 >> >> Prism at the moment vaccuously implements reshuffle by omitting the node, >> and rewriting the inputs and outputs [1], as it's a local runner with >> single transform per bundle execution, but I was intending to make it a >> fusion break regardless. Ultimately prism's "test" variant will default to >> executing the SDKs dictated reference implementation for the composite(s), >> and any "fast" or "prod" variant would simply do the current implementation. >> > > Very nice! > > And of course I should have linked out to the existing reshuffle URN in > the proto. > > Kenn > > > >> >> Robert Burke >> Beam Go Busybody >> >> [0]: >> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/graphx/translate.go#L46C3-L46C50 >> [1]: >> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/handlerunner.go#L82 >> >> >> >> On 2023/09/26 15:43:53 Kenneth Knowles wrote: >> > Hi everyone, >> > >> > Recently there was a bug [1] caused by discrepancies between two of >> > Dataflow's reshuffle implementations. I think the reference >> implementation >> > in the Java SDK [2] also does not match. This all led to discussion on >> the >> > bug and the pull request [3] about what the actual semantics should be. >> I >> > got it wrong, maybe multiple times. So I wrote up a very short document >> to >> > finish the discussion: >> > >> > https://s.apache.org/beam-reshuffle >> > >> > This is also probably among the simplest imaginable use of >> > http://s.apache.org/ptransform-design-doc in case you want to see kind >> of >> > how I intended it to be used. >> > >> > Kenn >> > >> > [1] https://github.com/apache/beam/issues/28219 >> > [2] >> > >> https://github.com/apache/beam/blob/d52b077ad505c8b50f10ec6a4eb83d385cdaf96a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java#L84 >> > [3] https://github.com/apache/beam/pull/28272 >> > >> >