Sounds good, thanks!

On Wed, Sep 10, 2025 at 12:29 PM Yi Hu <[email protected]> wrote:

> Thanks, I updated the doc and am writing a section. I meant that
> "CoderLogicalType" should go with `LogicalType<byte[], T>`, but the doc wasn't
> updated with it. Sorry for the confusion. I will proceed with proposal 2b,
> including the "EncodedBytesLogicalType" naming.
>
> Regards,
>
> Yi
>
> On Tue, Sep 9, 2025 at 3:37 PM Robert Bradshaw <[email protected]> wrote:
>
>> On Tue, Sep 9, 2025 at 10:01 AM Yi Hu via dev <[email protected]>
>> wrote:
>>
>>> Thanks all for the input. It helped a lot for the doc. From the
>>> feedback,
>>>
>>
>> Thanks for pursuing this and all the iteration on the doc.
>>
>>
>>> - The main concern is that RawType/CoderLogicalType break the strong
>>> mapping of schema<->coder. This is a valid concern.
>>>
>>> - On the other hand, it is a way to make schemas the fundamental concept
>>> (which is a goal of Beam 3), given that Beam and its ecosystem
>>> have already evolved for years with many Beam pipelines using (non-portable)
>>> coders+custom types.
>>>
>>> Based on this feedback, I suggest we proceed with the CoderLogicalType
>>> approach, given the requirements noted in the "Requirement" section of the doc,
>>> and in addition,
>>>
>>> - We should clearly document that this approach, if implemented, should
>>> not be used to bypass the schema framework. We always encourage schema-fying
>>> structured types.
>>>
>>> I'll start drafting changes for each supported SDK.
>>>
>>
>> I don't think the doc yet reflects the actual discussion, including what
>> I think the consensus was in
>> https://docs.google.com/document/d/1PggR27eg96Y8TzB9L29PszrMHPwL9u-JDTQ5N0Vc_5I/edit?disco=AAABqm7vQMU
>> (but it's worth fleshing this out to ensure we have the same idea of what
>> we think we're all agreeing on). I added this as option 2b.
>>
>> It may seem trivial, but I also think we should avoid the name
>> "CoderLogicalType" and go with something like EncodedBytesLogicalType (open
>> to other suggestions) which more accurately reflects the fact that coders
>> may not be present in all SDKs.
>>
>> - Robert
>>
>>
>>
>>> On Wed, Sep 3, 2025 at 1:51 PM Yi Hu <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Please find the following design doc for a portable RAW field type,
>>>> enabling arbitrary (serializable) data types to be included and to take
>>>> advantage of the Beam portable schema framework:
>>>>
>>>> https://s.apache.org/beam-portable-raw-type
>>>>
>>>> It aims to solve https://github.com/apache/beam/issues/23374 (as well
>>>> as https://github.com/apache/beam/issues/19817) as part of schema
>>>> improvement for Beam 3 (https://github.com/apache/beam/issues/34672).
>>>>
>>>> It also includes an appendix of term disambiguation between
>>>> Beam/Flink/Avro schema systems that you might find useful in general.
>>>>
>>>> I proposed two alternative designs. Any feedback is welcome!
>>>>
>>>> Regards,
>>>>
>>>> Yi
>>>>
>>>> --
>>>>
>>>> Yi Hu, (he/him/his)
>>>>
>>>> Software Engineer
>>>>
>>>>
>>>>
