+1 on moving this to the Parquet project/community (assuming that the
Parquet community is ok with this)

On Thu, Aug 22, 2024 at 3:02 AM Chao Sun <sunc...@apache.org> wrote:

> +1 too
>
> On Wed, Aug 21, 2024 at 4:43 PM huaxin gao <huaxin.ga...@gmail.com> wrote:
>
>> +1 for moving variant type to Parquet, as it promotes standardization and
>> interoperability across numerous projects.
>>
>> Huaxin
>>
>> On Wed, Aug 21, 2024 at 1:28 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>
>>> Agreed that Parquet would be a good place to host the new type.
>>> Different table formats, like Iceberg and Delta, can benefit from it as they
>>> are already based on Parquet.
>>>
>>> Yufei
>>>
>>>
>>> On Wed, Aug 21, 2024 at 12:15 AM Alkis Evlogimenos
>>> <alkis.evlogime...@databricks.com.invalid> wrote:
>>>
>>>> +1
>>>>
>>>> In addition to everything said above, it is also a great opportunity
>>>> for wider testing and possibly tweaking the spec before it takes off post
>>>> standardization.
>>>>
>>>> On Tue, Aug 20, 2024 at 4:36 PM Russell Spitzer <
>>>> russell.spit...@gmail.com> wrote:
>>>>
>>>>> I think this would be a great move to encourage all sorts of engines
>>>>> and table formats to take advantage of variant type and make sure it
>>>>> remains compatible between all those systems.
>>>>>
>>>>> I strongly support this,
>>>>> Russ
>>>>>
>>>>> On Tue, Aug 20, 2024 at 8:06 AM Fokko Driesprong <fo...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hey everyone,
>>>>>>
>>>>>> I agree the Parquet project is a good place to host and evolve the
>>>>>> spec (we could store it in parquet-variant?). We would need to align this
>>>>>> with the Parquet project. Anyway, I'm familiar both with Iceberg and
>>>>>> Parquet and happy to help where needed.
>>>>>>
>>>>>> Kind regards,
>>>>>> Fokko
>>>>>>
>>>>>>
>>>>>> On Mon, Aug 19, 2024 at 4:36 PM Reynold Xin
>>>>>> <r...@databricks.com.invalid> wrote:
>>>>>>
>>>>>>> As I said on dev@iceberg, it'd be really unfortunate if we end up
>>>>>>> with two or even more diverging specs for storing variants. It just adds
>>>>>>> more work for everybody to interop. Parquet would be a great home for this
>>>>>>> spec as a neutral project that almost all the other important projects in
>>>>>>> this space depend on as the de facto standard for physical data encoding
>>>>>>> and storage. So if we can collaborate with the Parquet community and get
>>>>>>> this into Parquet to avoid each project building its own spec, that'd be
>>>>>>> amazing.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Aug 17, 2024 at 2:56 AM Gene Pang <gene.p...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I am one of the main developers implementing Variant in Spark. The
>>>>>>>> specification and all the code are currently merged into the
>>>>>>>> common/variant
>>>>>>>> <https://github.com/apache/spark/tree/master/common/variant>
>>>>>>>> package in the Spark repo.
>>>>>>>>
>>>>>>>> There has been growing interest from other projects (such as
>>>>>>>> Iceberg) in supporting Variant, and we think that moving the Variant spec
>>>>>>>> and implementation out to a new home might be the best way for all the
>>>>>>>> different projects to be able to use and collaborate on Variant. We
>>>>>>>> originally put all the Variant code under common/variant with the
>>>>>>>> expectation that eventually it would be moved elsewhere.
>>>>>>>>
>>>>>>>> We are proposing that we move the Variant spec and implementation
>>>>>>>> out of the Spark project, to the Parquet project. Spark depends heavily on
>>>>>>>> Parquet, and the Variant spec contains a lot of details on the physical
>>>>>>>> storage layer, such as shredding. The Parquet project would be a great
>>>>>>>> place to standardize the Variant data type, and to enable interoperability
>>>>>>>> across many different projects. However, even when we move Variant out, we
>>>>>>>> expect to retain the compatibility with the current Spark implementation.
>>>>>>>>
>>>>>>>> What do people think? There are probably many details we still need
>>>>>>>> to figure out in terms of moving the implementation, but at a high level,
>>>>>>>> does it make sense to move Variant to Parquet?
>>>>>>>>
>>>>>>>> I appreciate your feedback!
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Gene
>>>>>>>>
>>>>>>>
