Sounds good, thanks!

On Wed, Sep 10, 2025 at 12:29 PM Yi Hu <[email protected]> wrote:
> Thanks and updated the doc and writing a section. I meant
> "CoderLogicalType" should go with `LogicalType<byte[], T>` but doc wasn't
> updated with it. Sorry for confusion. Will use proposal 2b including the
> "EncodedBytesLogicalType" naming to proceed.
>
> Regards,
>
> Yi
>
> On Tue, Sep 9, 2025 at 3:37 PM Robert Bradshaw <[email protected]> wrote:
>
>> On Tue, Sep 9, 2025 at 10:01 AM Yi Hu via dev <[email protected]>
>> wrote:
>>
>>> Thanks all for the input. It helped a lot for the doc. From the
>>> feedback,
>>>
>>
>> Thanks for pursuing this and all the iteration on the doc.
>>
>>
>>> - The main concern is that RawType/CoderLogicalType break the strong
>>> mapping of schema<->coder. This is a valid concern.
>>>
>>> - On the other hand, it is a way to make schemas the fundamental concept
>>> (which is a goal of Beam 3) under the situation that Beam and its ecosystem
>>> has already evolved for years with many Beam pipelines using (non-portable)
>>> coders+custom types.
>>>
>>> From these feedbacks, I suggest we proceed with CoderLogicalType
>>> approach, given the requirements noted in "Requirement" section of the doc,
>>> and in addition,
>>>
>>> - We should clearly document that this approach, if implemented, should
>>> not be used to bypass the schema framework. We always encourage schema-fy
>>> structured types.
>>>
>>> I'll start drafting changes for each supported SDK.
>>>
>>
>> I don't think the doc yet reflects the actual discussion, including what
>> I think the consensus was in
>> https://docs.google.com/document/d/1PggR27eg96Y8TzB9L29PszrMHPwL9u-JDTQ5N0Vc_5I/edit?disco=AAABqm7vQMU
>> (but it's worth fleshing this out to ensure we have the same idea of what
>> we think we're all agreeing on). I added this as option 2b.
>>
>> It may seem trivial, but I also think we should avoid the name
>> "CoderLogicalType" and go with something like EncodedBytesLogicalType (open
>> to other suggestions) which more accurately reflects the fact that coders
>> may not be present in all SDKs.
>>
>> - Robert
>>
>>
>>> On Wed, Sep 3, 2025 at 1:51 PM Yi Hu <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Please find the following design doc for a portable RAW field type
>>>> enabling arbitrary (serializable) data type to be included and take
>>>> advantage of the Beam portable schema framework
>>>>
>>>> https://s.apache.org/beam-portable-raw-type
>>>>
>>>> It aims to solve https://github.com/apache/beam/issues/23374 (as well
>>>> as https://github.com/apache/beam/issues/19817) as part of schema
>>>> improvement for Beam 3 (https://github.com/apache/beam/issues/34672).
>>>>
>>>> It also includes an appendix of term disambiguation between
>>>> Beam/Flink/Avro schema systems that might find useful in general.
>>>>
>>>> I proposed two alternative designs. Any feedback is welcome!
>>>>
>>>> Regards,
>>>>
>>>> Yi
>>>>
>>>> --
>>>>
>>>> Yi Hu, (he/him/his)
>>>>
>>>> Software Engineer
