On Wed, Oct 4, 2023 at 7:45 PM Robert Burke <lostl...@apache.org> wrote:

> LGTM.
>
> It looks the Go SDK already adheres to these semantics as well for the
> reference impl(well, reshuffle/redistribute_randomly, _by_key isn't
> implemented in the Go SDK, and only uses the existing unqualified reshuffle
> URN [0].
>
> The original strategy, and then for every element, the original Window,
> TS, and Pane are all serialized, shuffled, and then deserialized downstream.
>
>
> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/reshuffle.go#L65
>
>
> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/reshuffle.go#L145
>
> Prism at the moment vaccuously implements reshuffle by omitting the node,
> and rewriting the inputs and outputs [1], as it's a local runner with
> single transform per bundle execution, but I was intending to make it a
> fusion break regardless.  Ultimately prism's "test" variant will default to
> executing the SDKs dictated reference implementation for the composite(s),
> and any "fast" or "prod" variant would simply do the current implementation.
>

Very nice!

And of course I should have linked out to the existing reshuffle URN in the
proto.

Kenn



>
> Robert Burke
> Beam Go Busybody
>
> [0]:
> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/graphx/translate.go#L46C3-L46C50
> [1]:
> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/handlerunner.go#L82
>
>
>
> On 2023/09/26 15:43:53 Kenneth Knowles wrote:
> > Hi everyone,
> >
> > Recently there was a bug [1] caused by discrepancies between two of
> > Dataflow's reshuffle implementations. I think the reference
> implementation
> > in the Java SDK [2] also does not match. This all led to discussion on
> the
> > bug and the pull request [3] about what the actual semantics should be. I
> > got it wrong, maybe multiple times. So I wrote up a very short document
> to
> > finish the discussion:
> >
> >     https://s.apache.org/beam-reshuffle
> >
> > This is also probably among the simplest imaginable use of
> > http://s.apache.org/ptransform-design-doc in case you want to see kind
> of
> > how I intended it to be used.
> >
> > Kenn
> >
> > [1] https://github.com/apache/beam/issues/28219
> > [2]
> >
> https://github.com/apache/beam/blob/d52b077ad505c8b50f10ec6a4eb83d385cdaf96a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java#L84
> > [3] https://github.com/apache/beam/pull/28272
> >
>

Reply via email to