I'm not sure that this is the right place for a discussion about the merits
of their approach.

This list is for Iceberg development. I encourage anyone interested to
follow up on the appropriate incubator list rather than here.

I also think it's debatable whether advertising other projects is helpful
or wanted here, but I'd rather not add to the noise either way.

Ryan

On Tue, Dec 5, 2023 at 8:36 PM Jack Ye <yezhao...@gmail.com> wrote:

> I recently did an analysis of the OneTable project, and overall it left me
> a bit confused.
>
> From an end user's perspective, no one really wants to use all three of
> these formats, and most companies do not have the engineering resources to
> maintain a stack of all three. Eventually people pick one and
> just stick with it.
>
> If the goal is to provide a converter, then individual communities have
> developed different tools, such as Delta's Uniform, Iceberg's snapshot and
> migrate procedures, Hudi's bootstrap methods. The advantage of those tools
> is that the specific community knows the best way to convert a foreign data
> source to its native format, and can declare compatibility and fail
> whenever necessary. It is not bound by the expressiveness of an internal
> data model like OneTable, OneField, OneSchema, etc.
>
> If the goal is format unification, at least for me being in the Iceberg
> community with a bit of bias, a more straightforward way to achieve the goal
> is to extend the feature of "Iceberg external tables", where we can map
> Hive, Delta, Hudi and other table formats directly to Iceberg format behind
> a REST catalog, and make that readable. This is kind of related to a
> recent email thread I sent regarding the EXTERNAL/MANAGED syntax
> <https://lists.apache.org/thread/ohqfvhf4wofzkhrvff1lxl58blh432o6>. And
> linking back to this thread, that essentially makes Iceberg the unified
> format, and we are actually pretty close to achieving that. With this
> approach, you get more than conversion: you can (1) skip physical metadata
> conversion and instead convert table metadata to the Iceberg data model at
> runtime, (2) query all the tables using a single unified Iceberg connector
> in all supported engines, and (3) rely on a very standardized external
> table concept that all database system folks immediately understand.
>
> This makes me feel that we are trying to make OneTable a new table format
> without saying it is a new table format. Although the Apache Incubation
> proposal clearly says "OneTable is NOT a new table format", it is hard for
> me to envision a long-term roadmap that does not eventually make it a table
> format, with connectors and data maintenance features built directly
> against this internal model, which kind of feels like what the
> commercial entity OneHouse is trying to do right now, but maybe I am wrong.
>
> What do you think?
>
> Best,
> Jack Ye
>
> On Tue, Dec 5, 2023 at 3:30 PM Jesús Camacho Rodríguez <
> jcama...@apache.org> wrote:
>
>> Currently, there are no established group discussions. The project was
>> recently open-sourced, and communication is currently done through GitHub.
>> (If the project is accepted into the ASF incubator, mailing lists will be
>> created). If you're interested in regular meetings, feel free to suggest it
>> to the community on GitHub.
>>
>> Thanks,
>> Jesús
>>
>>
>> On 2023/12/05 06:30:38 Gaurav Agarwal wrote:
>> > Hi,
>> > Thanks for this mail. I would like to know whether any group discussion
>> > has also happened, or any call to discuss the issues.
>> >
>> > thanks
>> >
>> >
>> > On Tue, Dec 5, 2023 at 9:29 AM Walaa Eldin Moustafa <wa.moust...@gmail.com>
>> > wrote:
>> >
>> > > Thanks Jesus for sharing OneTable. Looks like it touches upon some of the
>> > > topics we discussed in the Rise of Table Formats panel at VLDB
>> > > <https://ceur-ws.org/Vol-3462/CDMS18.pdf> back in September. I was
>> > > browsing through the source code, and I ran into the OneField
>> > > <https://github.com/onetable-io/onetable/blob/main/api/src/main/java/io/onetable/model/schema/OneField.java>
>> > > class and noticed it has support for default values, which is good, but in
>> > > the Iceberg spec, there are two default values
>> > > <https://iceberg.apache.org/spec/#default-values> (more details in the
>> > > spec and respective PR). I was pointing this out as an example of small
>> > > nuances that can differ from one format to another and was wondering how
>> > > OneTable is planning to bridge them?
>> > >
>> > > Thanks,
>> > > Walaa.
>> > >
>> > >
>> >
>>
>

-- 
Ryan Blue
Tabular
