Hi Chen,

I think the best starting point would be to create a FLIP [1]. One of the
important topics from my point of view is to make sure that such changes
are not only available for SQL users, but are also being considered for
Table API, DataStream, and/or Python. There might be reasons not to do
so, but then those considerations should also be captured in the FLIP.

Another thing that would be interesting is how Thrift translates into Flink
connectors & Flink formats. Or is your Thrift implementation only a
connector?

Best regards,

Martijn

[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

On Sun, 29 May 2022 at 19:06, Chen Qin <qinnc...@gmail.com> wrote:

> Hi there,
>
> We would like to discuss and potentially upstream our Thrift support
> patches to Flink.
>
> For some context, we have internally patched Flink 1.11.2 to support
> FlinkSQL jobs that read/write Thrift-encoded Kafka sources/sinks. Over the
> course of the last 12 months, those patches have supported a few features
> not available in open source master, including:
>
>    - allowing a user-defined Thrift stub class name in table DDL for
>    schema inference (Thrift binary <-> Row)
>    - dynamically overwriting schema type information loaded from
>    HiveCatalog (Table only)
>    - forward compatibility when a Kafka topic is encoded with a new schema
>    (e.g. adding a new field)
>    - backward compatibility when a job with a new schema handles input or
>    state written with an old schema
>
> With more FlinkSQL jobs in production, we expect the maintenance burden of
> this divergent feature set to increase over the next 6-12 months,
> specifically around the following challenges:
>
>    - lack of a systematic way to support inference-based table/view DDL
>    (parity with Hive SerDe
>    <
> https://cwiki.apache.org/confluence/display/hive/serde#:~:text=SerDe%20Overview,-SerDe%20is%20short&text=Hive%20uses%20the%20SerDe%20interface,HDFS%20in%20any%20custom%20format
> >
>    )
>    - lack of a robust mapping from Thrift fields to Row fields
>    - dynamically updating the set of tables sharing the same inference
>    class when performing a schema change (e.g. adding a new field)
>    - minor: no handling of the UNSET case; NULL is used instead
>
> Please kindly provide pointers on the challenges above.
>
> Thanks,
> Chen, Pinterest.
>