On Fri, Aug 5, 2022 at 12:00 PM Byron Ellis <byronel...@google.com> wrote:
> I think there are some practical advantages to having the ability to
> support a dynamic version---at previous places where I've worked having
> Kafka's Schema Service was incredibly useful for data processing (it was a
> Java/Scala shop and we mostly used a "decode to POJO" approach rather than
> codegen.)
>

Yeah, that's my thought as well. I think it will be pretty useful during
development/testing cycles, especially if we push code generation to the
release time. Also, it will be useful for trying out any SchemaTransforms
developed/released by third parties where generated stubs might not be
available.

>
> On Fri, Aug 5, 2022 at 10:08 AM Chamikara Jayalath via dev <
> dev@beam.apache.org> wrote:
>
>>
>> On Fri, Aug 5, 2022 at 9:44 AM Brian Hulette <bhule...@google.com> wrote:
>>
>>> Thanks Cham! I really like the proposal, I left a few comments. I also
>>> had one higher-level point I wanted to elevate here:
>>>
>>> > Pipeline SDKs can generate user-friendly stub-APIs based on transforms
>>> > registered with an expansion service, eliminating the need to develop
>>> > language-specific wrappers.
>>>
>>> This would be great! I think one point to consider is whether we can do
>>> this statically. We could package up these stubs with releases and include
>>> them in API docs for each language, making them much more discoverable.
>>> That could be an extension on top of your proposal (e.g. as part of its
>>> build, each SDK spins up other known expansion services and generates code
>>> based on the discovery responses), but maybe it could be cleaner if we
>>> don't really need the dynamic version?
>>>
>>
>> So my proposal suggested two solutions for wrappers.
>> * A higher level (dynamic) API (SchemaAwareExternalTransform) that can be
>> used to discover/initialize/use any SchemaTransform.
>> * Developing tooling to generate stubs for each language. This is
>> possible since SchemaTransform gives a cleaner way to define/interpret the
>> construction API of a transform.
>>
>> I think both can be useful. For example, the prior might be useful to
>> quickly test/try out new SchemaTransforms without going through code
>> generation.
>>
>> Also, I agree with you that it might be good to generate such stubs (and
>> corresponding docs) during release time instead of generating and
>> committing stubs to the repo.
>>
>> Thanks,
>> Cham
>>
>>>
>>> Brian
>>>
>>>
>>> On Thu, Aug 4, 2022 at 6:51 PM Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I believe we can make the multi-language pipelines offering [1] much
>>>> easier to use by updating the expansion service to be fully aware of
>>>> SchemaTransforms. Additionally this will make it easy to
>>>> register/discover/use transforms defined in one SDK from all other SDKs.
>>>> Specifically we could add the following features.
>>>>
>>>> - Expansion service can be used to easily initialize and expand
>>>> transforms without need for additional code.
>>>> - Expansion service can be used to easily discover already
>>>> registered transforms.
>>>> - Pipeline SDKs can generate user-friendly stub-APIs based on
>>>> transforms registered with an expansion service, eliminating the need to
>>>> develop language-specific wrappers.
>>>>
>>>> Please see here for my proposal:
>>>> https://s.apache.org/easy-multi-language
>>>>
>>>> Lemme know if you have any comments/questions/suggestions :)
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>> [1]
>>>> https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines
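
For readers following the thread, the two wrapper options discussed above (a
dynamic API driven by discovery vs. stubs generated at release time) might be
contrasted with a rough Python sketch like the one below. Everything in it is
an assumption made for illustration: FieldSpec, TransformSpec, apply_dynamic,
generate_stub, and the transform identifier are hypothetical names, not Beam
APIs, and no real expansion service is contacted. It only shows that both
paths can be driven by the same discovered schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class FieldSpec:
    """One configuration field of a discovered transform (hypothetical shape)."""
    name: str
    type_name: str  # Python annotation to emit in generated code, e.g. "str"


@dataclass(frozen=True)
class TransformSpec:
    """What a discovery response might report for one transform (assumed)."""
    identifier: str
    fields: List[FieldSpec]


def apply_dynamic(spec: TransformSpec, **config: Any) -> Dict[str, Any]:
    """Dynamic path: check keyword arguments against the schema at call time."""
    expected = {f.name for f in spec.fields}
    unknown = set(config) - expected
    missing = expected - set(config)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    # A real SDK would send this payload to an expansion service;
    # this sketch just returns it.
    return {"identifier": spec.identifier, "config": config}


def generate_stub(spec: TransformSpec, func_name: str) -> str:
    """Static path: emit source for a typed wrapper around apply_dynamic."""
    params = ", ".join(f"{f.name}: {f.type_name}" for f in spec.fields)
    kwargs = ", ".join(f"{f.name}={f.name}" for f in spec.fields)
    return (
        f"def {func_name}({params}):\n"
        f"    return apply_dynamic(SPEC, {kwargs})\n"
    )


# Made-up transform description standing in for a discovery response.
SPEC = TransformSpec(
    identifier="example:schematransform:my_read:v1",
    fields=[FieldSpec("topic", "str"), FieldSpec("max_records", "int")],
)

# Generate the stub source, load it, and call it like a hand-written wrapper.
source = generate_stub(SPEC, "my_read")
namespace = {"apply_dynamic": apply_dynamic, "SPEC": SPEC}
exec(source, namespace)
result = namespace["my_read"](topic="events", max_records=10)
```

In this sketch the generated stub is a thin layer over the dynamic call, so
the dynamic path stays usable for transforms that have no stub yet, while the
generated source is what could be packaged and documented with a release.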