Hi All,

Thanks for the comments so far. Seems like we generally agree on this proposal.

Please see https://github.com/apache/beam/pull/22802 for a prototype implementation that adds the following.

* Support for dynamically discovering and registering SchemaTransforms in the Java expansion service.
* Support for dynamically discovering registered SchemaTransforms from the Python side.
* Support for using SchemaTransforms in Python pipelines.

Feel free to add more comments to the doc and/or the PR.

Thanks,
Cham

On Mon, Aug 8, 2022 at 9:34 PM Chamikara Jayalath <chamik...@google.com> wrote:

> I think the *DiscoverSchemaTransform()* RPC introduced in this proposal
> and the ability to easily deploy/use available *SchemaTransforms* using
> an expansion service essentially provide the tooling necessary for
> implementing such a service. Such a service could even startup expansion
> services to discover/list transforms available in given artifacts (for
> example, jar files).
>
> Thanks,
> Cham
>
> On Mon, Aug 8, 2022 at 3:48 PM Byron Ellis <byronel...@google.com> wrote:
>
>> I like that idea, sort of like Kafka’s Schema Service but for transforms?
>>
>> On Mon, Aug 8, 2022 at 2:45 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>
>>> This is a great idea. I would like to approach this from the
>>> perspective of making it easy to provide a catalog of well-defined
>>> transforms for use in expansion services from typical SDKs and also
>>> elsewhere (e.g. for documentation purposes, GUIs, etc.) Ideally
>>> everything about what a transform is (its config, documentation,
>>> expectations on inputs, etc.) can be specified programmatically in a
>>> way that's much easier to both author and consume than it is now.
>>>
>>> On Thu, Aug 4, 2022 at 6:51 PM Chamikara Jayalath via dev <dev@beam.apache.org> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > I believe we can make the multi-language pipelines offering [1] much easier to use by updating the expansion service to be fully aware of SchemaTransforms.
>>> > Additionally this will make it easy to register/discover/use transforms defined in one SDK from all other SDKs. Specifically we could add the following features.
>>> >
>>> > * Expansion service can be used to easily initialize and expand transforms without need for additional code.
>>> > * Expansion service can be used to easily discover already registered transforms.
>>> > * Pipeline SDKs can generate user-friendly stub-APIs based on transforms registered with an expansion service, eliminating the need to develop language-specific wrappers.
>>> >
>>> > Please see here for my proposal: https://s.apache.org/easy-multi-language
>>> >
>>> > Lemme know if you have any comments/questions/suggestions :)
>>> >
>>> > Thanks,
>>> > Cham
>>> >
>>> > [1] https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines