Indeed, there's nothing stopping you from doing codegen where it's useful, but I think it's probably easier to implement codegen on top of a dynamic API than it is to go the other way around (Avro vs. Proto).
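
To make that concrete, here is a rough sketch in plain Python of what "codegen from dynamic" could look like. Everything in it is made up for illustration and is not the Beam API: `TransformSpec`, `REGISTRY`, `dynamic_transform`, `generate_stub`, and the `example:schematransform:read_csv:v1` identifier are hypothetical stand-ins for a discovery response and a dynamic wrapper. The point is only that once the dynamic path exists, a generated stub can be a thin typed function that delegates to it.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class TransformSpec:
    """Hypothetical discovery entry: an identifier plus a config schema."""
    identifier: str
    # (field name, Python type name) pairs standing in for a real schema.
    config_fields: Tuple[Tuple[str, str], ...]


# Hypothetical stand-in for what a discovery call might return.
REGISTRY: Dict[str, TransformSpec] = {
    "example:schematransform:read_csv:v1": TransformSpec(
        identifier="example:schematransform:read_csv:v1",
        config_fields=(("path", "str"), ("delimiter", "str")),
    ),
}


def dynamic_transform(identifier: str, **config: Any) -> Dict[str, Any]:
    """Dynamic path: check the config against the discovered schema at call time."""
    spec = REGISTRY[identifier]
    expected = {name for name, _ in spec.config_fields}
    unknown = set(config) - expected
    missing = expected - set(config)
    if unknown or missing:
        raise TypeError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    # A real implementation would hand this off for expansion; here we just
    # return the request that would be sent.
    return {"identifier": identifier, "config": dict(config)}


def generate_stub(function_name: str, spec: TransformSpec) -> str:
    """Codegen path: emit source for a typed wrapper that calls the dynamic API."""
    params = ", ".join(f"{name}: {type_name}" for name, type_name in spec.config_fields)
    kwargs = ", ".join(f"{name}={name}" for name, _ in spec.config_fields)
    return (
        f"def {function_name}({params}):\n"
        f"    return dynamic_transform({spec.identifier!r}, {kwargs})\n"
    )
```

Calling `generate_stub("read_csv", REGISTRY["example:schematransform:read_csv:v1"])` yields the source of a `read_csv(path, delimiter)` function whose body is a single call to `dynamic_transform`. Going the other way (recovering a dynamic API from generated code) would mean reverse-engineering the schema from the stubs, which is why I'd start from the dynamic side.
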

On Fri, Aug 5, 2022 at 1:15 PM Chamikara Jayalath <chamik...@google.com> wrote:

>
> On Fri, Aug 5, 2022 at 12:00 PM Byron Ellis <byronel...@google.com> wrote:
>
>> I think there are some practical advantages to having the ability to
>> support a dynamic version---at previous places where I've worked having
>> Kafka's Schema Service was incredibly useful for data processing (it was a
>> Java/Scala shop and we mostly used a "decode to POJO" approach rather than
>> codegen.)
>>
>
> Yeah, that's my thought as well. I think it will be pretty useful during
> development/testing cycles, especially if we push code generation to the
> release time. Also, it will be useful for trying out any SchemaTransforms
> developed/released by third parties where generated stubs might not be
> available.
>
>>
>> On Fri, Aug 5, 2022 at 10:08 AM Chamikara Jayalath via dev <
>> dev@beam.apache.org> wrote:
>>
>>>
>>> On Fri, Aug 5, 2022 at 9:44 AM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> Thanks Cham! I really like the proposal, I left a few comments. I also
>>>> had one higher-level point I wanted to elevate here:
>>>>
>>>> > Pipeline SDKs can generate user-friendly stub-APIs based on
>>>> > transforms registered with an expansion service, eliminating the
>>>> > need to develop language-specific wrappers.
>>>>
>>>> This would be great! I think one point to consider is whether we can do
>>>> this statically. We could package up these stubs with releases and include
>>>> them in API docs for each language, making them much more discoverable.
>>>> That could be an extension on top of your proposal (e.g. as part of its
>>>> build, each SDK spins up other known expansion services and generates code
>>>> based on the discovery responses), but maybe it could be cleaner if we
>>>> don't really need the dynamic version?
>>>>
>>>
>>> So my proposal suggested two solutions for wrappers.
>>>
>>> * A higher level (dynamic) API (SchemaAwareExternalTransform) that can
>>>   be used to discover/initialize/use any SchemaTransform.
>>> * Developing tooling to generate stubs for each language. This is
>>>   possible since SchemaTransform gives a cleaner way to define/interpret the
>>>   construction API of a transform.
>>>
>>> I think both can be useful. For example, the prior might be useful to
>>> quickly test/try out new SchemaTransforms without going through code
>>> generation.
>>>
>>> Also, I agree with you that it might be good to generate such stubs (and
>>> corresponding docs) during release time instead of generating and
>>> committing stubs to the repo.
>>>
>>> Thanks,
>>> Cham
>>>
>>>>
>>>> Brian
>>>>
>>>>
>>>> On Thu, Aug 4, 2022 at 6:51 PM Chamikara Jayalath via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I believe we can make the multi-language pipelines offering [1] much
>>>>> easier to use by updating the expansion service to be fully aware of
>>>>> SchemaTransforms. Additionally this will make it easy to
>>>>> register/discover/use transforms defined in one SDK from all other SDKs.
>>>>> Specifically we could add the following features.
>>>>>
>>>>> - Expansion service can be used to easily initialize and expand
>>>>>   transforms without need for additional code.
>>>>> - Expansion service can be used to easily discover already
>>>>>   registered transforms.
>>>>> - Pipeline SDKs can generate user-friendly stub-APIs based on
>>>>>   transforms registered with an expansion service, eliminating the need
>>>>>   to develop language-specific wrappers.
>>>>>
>>>>> Please see here for my proposal:
>>>>> https://s.apache.org/easy-multi-language
>>>>>
>>>>> Lemme know if you have any comments/questions/suggestions :)
>>>>>
>>>>> Thanks,
>>>>> Cham
>>>>>
>>>>> [1]
>>>>> https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines
>>>>>