Re: [VOTE] Move Variant to Parquet

2024-09-03 Thread Gene Pang
10:28 AM Micah Kornfield wrote: > I think maybe we should finalize the details before having a vote, to make > sure everyone understands the implications? > > On Tue, Sep 3, 2024 at 9:12 AM Gene Pang wrote: > >> Hi, >> >> In general, the Iceberg community is

Re: [VOTE] Move Variant to Parquet

2024-09-03 Thread Gene Pang
Muralidharan wrote: > Hi, > > What were the conclusions of the discussions with the Parquet and Iceberg > communities on this? > > Thanks, > Mridul > > On Mon, Sep 2, 2024 at 12:48 PM Gene Pang wrote: > >> Hi all, >> >> I’d like to start a vote for moving the Var

[VOTE] Move Variant to Parquet

2024-09-02 Thread Gene Pang
Hi all, I’d like to start a vote for moving the Variant specification and library to the Parquet project. This allows the Variant binary format and shredding format to be more widely used by other interested projects and systems. Please refer to the discussion thread: https://lists.apache.org/thr

[DISCUSS] Move Variant to Parquet?

2024-08-16 Thread Gene Pang
Hi all, I am one of the main developers implementing Variant in Spark. The specification and all the code are currently merged into the common/variant package in the Spark repo. There has been growing interest from other projects (such

[DISCUSS] Variant shredding specification

2024-06-03 Thread Gene Pang
Hi all, We have been working on the Variant data type, which is designed to store and process semi-structured data efficiently, even with heterogeneous values. Users can store and process semi-structured data in a flexible way, without having to specify or know any fixed schema on write. Variant d