Re: Easy Multi-language via a SchemaTransform-aware Expansion Service

Chamikara Jayalath via dev Fri, 05 Aug 2022 13:16:05 -0700

On Fri, Aug 5, 2022 at 12:00 PM Byron Ellis <byronel...@google.com> wrote:


> I think there are some practical advantages to having the ability to
> support a dynamic version---at previous places where I've worked having
> Kafka's Schema Service was incredibly useful for data processing (it was a
> Java/Scala shop and we mostly used a "decode to POJO" approach rather than
> codegen.)
>

Yeah, that's my thought as well. I think it will be pretty useful during
development/testing cycles, especially if we push code generation to
release time. It will also be useful for trying out SchemaTransforms
developed/released by third parties for which generated stubs might not be
available.
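
As a rough illustration of how the dynamic and generated-stub paths could
share one discovery response, here is a self-contained Python sketch. It is
not Beam's actual API: the in-memory REGISTRY, the transform identifier, and
the names DynamicTransform and generate_stub are all hypothetical stand-ins
for an expansion service and the tooling discussed in this thread.

```python
from dataclasses import dataclass

# Hypothetical stand-in for an expansion service's discovery response:
# transform identifier -> configuration schema (field name -> Python type).
REGISTRY = {
    "example:schematransform:read_topic:v1": {"topic": str, "max_records": int},
}


def discover():
    """Return every registered (identifier, schema) pair."""
    return dict(REGISTRY)


@dataclass
class DynamicTransform:
    """Dynamic path: configure any discovered transform by identifier."""
    identifier: str
    config: dict

    def __post_init__(self):
        # Validate the supplied configuration against the discovered schema.
        schema = discover()[self.identifier]
        unknown = set(self.config) - set(schema)
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        for name, value in self.config.items():
            if not isinstance(value, schema[name]):
                raise TypeError(f"{name} must be {schema[name].__name__}")


def generate_stub(class_name, identifier):
    """Static path: emit wrapper source from the same schema (a release-time step)."""
    schema = discover()[identifier]
    params = ", ".join(f"{n}: {t.__name__}" for n, t in schema.items())
    args = ", ".join(f"{n!r}: {n}" for n in schema)
    return (
        f"class {class_name}(DynamicTransform):\n"
        f"    def __init__(self, {params}):\n"
        f"        super().__init__({identifier!r}, {{{args}}})\n"
    )
```

In this sketch the dynamic path needs only an identifier and keyword values,
while generate_stub turns the same schema into a typed wrapper class that
could be packaged and documented with a release.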


>
> On Fri, Aug 5, 2022 at 10:08 AM Chamikara Jayalath via dev <
> dev@beam.apache.org> wrote:
>
>>
>>
>> On Fri, Aug 5, 2022 at 9:44 AM Brian Hulette <bhule...@google.com> wrote:
>>
>>> Thanks Cham! I really like the proposal, I left a few comments. I also
>>> had one higher-level point I wanted to elevate here:
>>>
>>> > Pipeline SDKs can generate user-friendly stub-APIs based on transforms
>>> > registered with an expansion service, eliminating the need to develop
>>> > language-specific wrappers.
>>>
>>> This would be great! I think one point to consider is whether we can do
>>> this statically. We could package up these stubs with releases and include
>>> them in API docs for each language, making them much more discoverable.
>>> That could be an extension on top of your proposal (e.g. as part of its
>>> build, each SDK spins up other known expansion services and generates code
>>> based on the discovery responses), but maybe it could be cleaner if we
>>> don't really need the dynamic version?
>>>
>>
>> So my proposal suggested two solutions for wrappers.
>> * A higher level (dynamic) API (SchemaAwareExternalTransform) that can be
>> used to discover/initialize/use any SchemaTransform.
>> * Developing tooling to generate stubs for each language. This is
>> possible since SchemaTransform gives a cleaner way to define/interpret the
>> construction API of a transform.
>>
>> I think both can be useful. For example, the former might be useful to
>> quickly test/try out new SchemaTransforms without going through code
>> generation.
>>
>> Also, I agree with you that it might be good to generate such stubs (and
>> corresponding docs) during release time instead of generating and
>> committing stubs to the repo.
>>
>> Thanks,
>> Cham
>>
>>
>>>
>>> Brian
>>>
>>>
>>> On Thu, Aug 4, 2022 at 6:51 PM Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I believe we can make the multi-language pipelines offering [1] much
>>>> easier to use by updating the expansion service to be fully aware of
>>>> SchemaTransforms. Additionally, this will make it easy to
>>>> register/discover/use transforms defined in one SDK from all other SDKs.
>>>> Specifically, we could add the following features.
>>>>
>>>>    - Expansion service can be used to easily initialize and expand
>>>>    transforms without the need for additional code.
>>>>    - Expansion service can be used to easily discover already
>>>>    registered transforms.
>>>>    - Pipeline SDKs can generate user-friendly stub-APIs based on
>>>>    transforms registered with an expansion service, eliminating the need to
>>>>    develop language-specific wrappers.
>>>>
>>>> Please see here for my proposal:
>>>> https://s.apache.org/easy-multi-language
>>>>
>>>> Lemme know if you have any comments/questions/suggestions :)
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>> [1]
>>>> https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines
>>>>
>>>>
