Re: Easy Multi-language via a SchemaTransform-aware Expansion Service

Byron Ellis via dev Fri, 05 Aug 2022 12:00:30 -0700

I think there are some practical advantages to being able to support a
dynamic version. At previous places where I've worked, having Kafka's
Schema Service was incredibly useful for data processing (it was a
Java/Scala shop, and we mostly used a "decode to POJO" approach rather than
codegen).
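
As a rough illustration of the dynamic route, here is a minimal Python
sketch that validates a transform's configuration against a schema obtained
at runtime, with no generated code. Everything in it is an assumption made
for illustration: `DiscoveredTransform`, `build_config`, the identifier
string, and the field names are hypothetical and are not Beam or expansion
service APIs.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class DiscoveredTransform:
    """Hypothetical stand-in for one entry of a discovery response."""
    identifier: str
    config_schema: Dict[str, type]  # field name -> expected Python type


def build_config(transform: DiscoveredTransform, **kwargs: Any) -> Dict[str, Any]:
    """Check kwargs against the schema at call time and return the config."""
    unknown = set(kwargs) - set(transform.config_schema)
    if unknown:
        raise TypeError(
            f"unknown fields for {transform.identifier}: {sorted(unknown)}")
    missing = set(transform.config_schema) - set(kwargs)
    if missing:
        raise TypeError(
            f"missing fields for {transform.identifier}: {sorted(missing)}")
    for name, value in kwargs.items():
        expected = transform.config_schema[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"field {name!r} expects {expected.__name__}, "
                f"got {type(value).__name__}")
    return dict(kwargs)


# Hypothetical schema, as if it had just been fetched from a service.
kafka_read = DiscoveredTransform(
    identifier="example:schematransform:kafka_read:v1",
    config_schema={"topic": str, "bootstrap_servers": str},
)

config = build_config(
    kafka_read, topic="events", bootstrap_servers="localhost:9092")
```

The trade-off is the usual one: mistakes surface when the pipeline is
constructed rather than in an IDE or in API docs, but a new transform is
usable as soon as it is registered.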


On Fri, Aug 5, 2022 at 10:08 AM Chamikara Jayalath via dev <
dev@beam.apache.org> wrote:

>
>
> On Fri, Aug 5, 2022 at 9:44 AM Brian Hulette <bhule...@google.com> wrote:
>
>> Thanks Cham! I really like the proposal, I left a few comments. I also
>> had one higher-level point I wanted to elevate here:
>>
>> > Pipeline SDKs can generate user-friendly stub-APIs based on transforms
>> registered with an expansion service, eliminating the need to develop
>> language-specific wrappers.
>> This would be great! I think one point to consider is whether we can do
>> this statically. We could package up these stubs with releases and include
>> them in API docs for each language, making them much more discoverable.
>> That could be an extension on top of your proposal (e.g. as part of its
>> build, each SDK spins up other known expansion services and generates code
>> based on the discovery responses), but maybe it could be cleaner if we
>> don't really need the dynamic version?
>>
>
> So my proposal suggested two solutions for wrappers.
> * A higher level (dynamic) API (SchemaAwareExternalTransform) that can be
> used to discover/initialize/use any SchemaTransform.
> * Developing tooling to generate stubs for each language. This is possible
> since SchemaTransform gives a cleaner way to define/interpret the
> construction API of a transform.
>
> I think both can be useful. For example, the former might be useful to
> quickly test/try out new SchemaTransforms without going through code
> generation.
>
> Also, I agree with you that it might be good to generate such stubs (and
> corresponding docs) at release time instead of generating and
> committing stubs to the repo.
>
> Thanks,
> Cham
>
>
>>
>> Brian
>>
>>
>> On Thu, Aug 4, 2022 at 6:51 PM Chamikara Jayalath via dev <
>> dev@beam.apache.org> wrote:
>>
>>> Hi All,
>>>
>>> I believe we can make the multi-language pipelines offering [1] much
>>> easier to use by updating the expansion service to be fully aware of
>>> SchemaTransforms. Additionally, this will make it easy to
>>> register/discover/use transforms defined in one SDK from all other SDKs.
>>> Specifically, we could add the following features.
>>>
>>>    - Expansion service can be used to easily initialize and expand
>>>    transforms without the need for additional code.
>>>    - Expansion service can be used to easily discover already
>>>    registered transforms.
>>>    - Pipeline SDKs can generate user-friendly stub-APIs based on
>>>    transforms registered with an expansion service, eliminating the need to
>>>    develop language-specific wrappers.
>>>
>>> Please see here for my proposal:
>>> https://s.apache.org/easy-multi-language
>>>
>>> Lemme know if you have any comments/questions/suggestions :)
>>>
>>> Thanks,
>>> Cham
>>>
>>> [1]
>>> https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines
>>>
>>>

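The stub-generation option from the quoted proposal could look roughly like
the Python sketch below: take a transform's configuration schema and emit
source code for a typed wrapper function, which could then be packaged and
documented at release time. This is only a sketch under stated assumptions.
`generate_stub`, the identifier string, the field names, and the shape of
the returned dict are all hypothetical and do not reflect an actual Beam
API or the format of a real discovery response.

```python
from typing import Dict


def generate_stub(func_name: str, identifier: str,
                  config_schema: Dict[str, type]) -> str:
    """Return Python source for a wrapper with one typed argument per field."""
    params = ", ".join(
        f"{name}: {tp.__name__}" for name, tp in config_schema.items())
    fields = ", ".join(f"{name!r}: {name}" for name in config_schema)
    return (
        f"def {func_name}({params}) -> dict:\n"
        f"    # Generated stub for {identifier}; do not edit by hand.\n"
        f"    return {{'identifier': {identifier!r}, "
        f"'config': {{{fields}}}}}\n"
    )


# Hypothetical schema standing in for one discovered transform.
source = generate_stub(
    func_name="kafka_read",
    identifier="example:schematransform:kafka_read:v1",
    config_schema={"topic": str, "bootstrap_servers": str},
)

# A build step would write `source` to a file; here it is loaded directly
# to show that the generated code is importable and callable.
namespace: dict = {}
exec(source, namespace)
result = namespace["kafka_read"](
    topic="events", bootstrap_servers="localhost:9092")
```

Because the wrapper is ordinary source code, its signature shows up in API
docs and editor completion, which is the discoverability benefit raised in
the thread.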