Re: Easy Multi-language via a SchemaTransform-aware Expansion Service

Byron Ellis via dev Fri, 05 Aug 2022 13:48:10 -0700

Indeed, there's nothing stopping you from doing codegen where it's useful,
but I think it's probably easier to implement codegen from the dynamic
version than it is to go the other way around (Avro vs. Proto).
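
As a rough illustration of why codegen can sit on top of the dynamic path,
here is a minimal Python sketch. It does not use Beam at all:
TransformDescription, apply_dynamic, generate_stub_source, and the
identifier string are hypothetical names made up for this example, and the
dict returned by apply_dynamic merely stands in for an expansion request.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class TransformDescription:
    """Hypothetical stand-in for what discovery might report for one transform."""
    identifier: str
    config_fields: Dict[str, type]  # field name -> expected Python type


def apply_dynamic(description: TransformDescription, **config: Any) -> Dict[str, Any]:
    """Dynamic path: check the config against the discovered schema at call time."""
    unknown = set(config) - set(description.config_fields)
    if unknown:
        raise TypeError(f"unknown fields: {sorted(unknown)}")
    for name, expected in description.config_fields.items():
        if name not in config:
            raise TypeError(f"missing field: {name}")
        if not isinstance(config[name], expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")
    # Placeholder result; a real SDK would build an expansion request here.
    return {"identifier": description.identifier, "config": dict(config)}


def generate_stub_source(function_name: str, description: TransformDescription) -> str:
    """Codegen path: emit a typed wrapper that simply delegates to apply_dynamic."""
    params = ", ".join(
        f"{name}: {tp.__name__}" for name, tp in description.config_fields.items()
    )
    kwargs = ", ".join(f"{name}={name}" for name in description.config_fields)
    return (
        f"def {function_name}({params}):\n"
        f"    return apply_dynamic(DESCRIPTION, {kwargs})\n"
    )


# Assumed discovery result for a made-up transform.
desc = TransformDescription(
    identifier="example:read_rows:v1",
    config_fields={"topic": str, "max_records": int},
)

# Generate the stub source and load it; a release build could write it to a file instead.
source = generate_stub_source("read_rows", desc)
namespace = {"apply_dynamic": apply_dynamic, "DESCRIPTION": desc}
exec(source, namespace)
read_rows = namespace["read_rows"]

print(source)
print(read_rows(topic="events", max_records=10))
```

The generated stub is only a thin, typed layer over the dynamic call, which
is the direction I'd expect to be the easier one to build.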


On Fri, Aug 5, 2022 at 1:15 PM Chamikara Jayalath <chamik...@google.com>
wrote:

>
>
> On Fri, Aug 5, 2022 at 12:00 PM Byron Ellis <byronel...@google.com> wrote:
>
>> I think there are some practical advantages to having the ability to
>> support a dynamic version. At previous places where I've worked, having
>> Kafka's Schema Service was incredibly useful for data processing (it was a
>> Java/Scala shop, and we mostly used a "decode to POJO" approach rather
>> than codegen).
>>
>
> Yeah, that's my thought as well. I think it will be pretty useful during
> development/testing cycles, especially if we push code generation to
> release time. It will also be useful for trying out any SchemaTransforms
> developed/released by third parties, where generated stubs might not be
> available.
>
>
>>
>> On Fri, Aug 5, 2022 at 10:08 AM Chamikara Jayalath via dev <
>> dev@beam.apache.org> wrote:
>>
>>>
>>>
>>> On Fri, Aug 5, 2022 at 9:44 AM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> Thanks Cham! I really like the proposal, and I left a few comments. I
>>>> also had one higher-level point I wanted to raise here:
>>>>
>>>> > Pipeline SDKs can generate user-friendly stub-APIs based on
>>>> > transforms registered with an expansion service, eliminating the
>>>> > need to develop language-specific wrappers.
>>>>
>>>> This would be great! I think one point to consider is whether we can do
>>>> this statically. We could package up these stubs with releases and include
>>>> them in API docs for each language, making them much more discoverable.
>>>> That could be an extension on top of your proposal (e.g. as part of its
>>>> build, each SDK spins up other known expansion services and generates code
>>>> based on the discovery responses), but maybe it could be cleaner if we
>>>> don't really need the dynamic version?
>>>>
>>>
>>> So my proposal suggested two solutions for wrappers.
>>> * A higher level (dynamic) API (SchemaAwareExternalTransform) that can
>>> be used to discover/initialize/use any SchemaTransform.
>>> * Developing tooling to generate stubs for each language. This is
>>> possible since SchemaTransform gives a cleaner way to define/interpret the
>>> construction API of a transform.
>>>
>>> I think both can be useful. For example, the former might be useful to
>>> quickly test/try out new SchemaTransforms without going through code
>>> generation.
>>>
>>> Also, I agree with you that it might be good to generate such stubs (and
>>> corresponding docs) during release time instead of generating and
>>> committing stubs to the repo.
>>>
>>> Thanks,
>>> Cham
>>>
>>>
>>>>
>>>> Brian
>>>>
>>>>
>>>> On Thu, Aug 4, 2022 at 6:51 PM Chamikara Jayalath via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I believe we can make the multi-language pipelines offering [1] much
>>>>> easier to use by updating the expansion service to be fully aware of
>>>>> SchemaTransforms. Additionally, this will make it easy to
>>>>> register/discover/use transforms defined in one SDK from all other SDKs.
>>>>> Specifically, we could add the following features.
>>>>>
>>>>>    - Expansion service can be used to easily initialize and expand
>>>>>    transforms without the need for additional code.
>>>>>    - Expansion service can be used to easily discover already
>>>>>    registered transforms.
>>>>>    - Pipeline SDKs can generate user-friendly stub-APIs based on
>>>>>    transforms registered with an expansion service, eliminating the
>>>>>    need to develop language-specific wrappers.
>>>>>
>>>>> Please see here for my proposal:
>>>>> https://s.apache.org/easy-multi-language
>>>>>
>>>>> Lemme know if you have any comments/questions/suggestions :)
>>>>>
>>>>> Thanks,
>>>>> Cham
>>>>>
>>>>> [1]
>>>>> https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines
>>>>>
>>>>>
