+1 for this FLIP. VARIANT type support will be a great addition to SQL.
Looking forward to the detailed design of the subsequent shredding
optimizations.


Best,
Lincoln Lee


On Tue, Apr 22, 2025 at 4:51 PM Timo Walther <twal...@apache.org> wrote:

> +1 for this feature. Having a VARIANT type makes a lot of sense and
> together with an OBJECT type will make semi-structured data processing
> in Flink easier.
>
> Currently, I'm catching up with notifications after the Easter holidays,
> but happy to give some feedback by tomorrow or Thursday as well.
>
> Thanks,
> Timo
>
> On 22.04.25 10:40, Jingsong Li wrote:
> > Thanks Xuannan for driving this discussion.
> >
> > At present, communities such as Apache Iceberg, Delta, Spark, Parquet,
> > etc. are all designing and developing around Variant, and our Flink
> > support for Variant is very valuable.
> >
> > After a rough look at the design, I see no major problems. It is
> > designed around Parquet's Variant standard, which is similar to the
> > overall design of Spark SQL.
> >
> > +1 for this.
> >
> > Best,
> > Jingsong
> >
> > On Mon, Apr 14, 2025 at 6:12 PM Xuannan Su <suxuanna...@gmail.com> wrote:
> >>
> >> Hi devs,
> >>
> >> I’d like to start a discussion around FLIP-521: Integrating Variant
> >> Type into Flink: Enabling Efficient Semi-Structured Data
> >> Processing[1]. Working with semi-structured data has long been a
> >> foundational scenario of the Lakehouse. While JSON has traditionally
> >> served as the primary storage format for such data, its implementation
> >> as serialized strings introduces significant inefficiencies.
> >>
> >> In this FLIP, we integrate the Variant encoding, which is a compact
> >> binary representation of semi-structured data[2], to improve the
> >> performance of processing semi-structured data. As Paimon has
> >> supported the Variant type recently[3], this FLIP would allow Flink to
> >> further leverage Paimon's storage-layer optimizations, improving
> >> performance and resource utilization for semi-structured data
> >> pipelines.
> >>
> >> Best,
> >> Xuannan
> >>
> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-521%3A+Integrating+Variant+Type+into+Flink%3A+Enabling+Efficient+Semi-Structured+Data+Processing
> >> [2] https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> >> [3] https://github.com/apache/paimon/issues/4471
> >
>
>
