I'm not sure that this is the right place for a discussion about the merits of their approach.
This list is for Iceberg development. I encourage anyone interested to follow up on the appropriate incubator list rather than here. I also think it's debatable whether advertising other projects is helpful or wanted here, but I'd rather not add to the noise either way.

Ryan

On Tue, Dec 5, 2023 at 8:36 PM Jack Ye <yezhao...@gmail.com> wrote:

> I recently did an analysis of the OneTable project, overall it made me a
> bit confused.
>
> From an end user's perspective, no one really wants to use all these 3
> formats, and most companies do not have the engineering resources to
> maintain a stack of all these 3 formats. Eventually people pick one and
> just stick with it.
>
> If the goal is to provide a converter, then individual communities have
> developed different tools, such as Delta's Uniform, Iceberg's snapshot and
> migrate procedures, Hudi's bootstrap methods. The advantage of those tools
> is that the specific community knows the best way to convert a foreign data
> source to its native format, and can declare compatibility and fail
> whenever necessary. It is not bounded to the expressiveness of an internal
> data model like OneTable, OneField, OneSchema, etc.
>
> If the goal is format unification, at least for me being in the Iceberg
> community with a bit bias, a more straightforward way to achieve the goal
> is to extend the feature of "Iceberg external tables", where we can map
> Hive, Delta, Hudi and other table formats directly to Iceberg format behind
> a REST catalog, and make that readable. This is kind of related to a
> recent email thread I sent regarding the EXTERNAL/MANAGED syntax
> <https://lists.apache.org/thread/ohqfvhf4wofzkhrvff1lxl58blh432o6>. And
> linking back to this thread, that essentially makes Iceberg the unified
> format, and we are actually pretty close to achieving that. With this
> approach, you get not just conversion, you can (1) not do physical metadata
> conversion but directly convert table metadata at runtime to Iceberg data
> model, (2) query all the tables using a single unified Iceberg connector in
> all supported engines, and (3) it is a very standardized external table
> concept that all database system folks immediately understand.
>
> This makes me feel that we are trying to make OneTable a new table format
> without saying it is a new table format. Although the Apache Incubation
> proposal clearly says "OneTable is NOT a new table format", it is hard for
> me to envision a long-term roadmap that does not eventually make it a table
> format, with connectors and data maintenance features built directly
> against this internal model, which is kind of feels like what the
> commercial entity OneHouse is trying to do right now, but maybe I am wrong.
>
> What do you think?
>
> Best,
> Jack Ye
>
> On Tue, Dec 5, 2023 at 3:30 PM Jesús Camacho Rodríguez <jcama...@apache.org> wrote:
>
>> Currently, there is no established group discussions. The project was
>> recently open-sourced, and communication is currently done through GitHub.
>> (If the project is accepted into the ASF incubator, mailing lists will be
>> created). If you're interested in regular meetings, feel free to suggest it
>> to the community on GitHub.
>>
>> Thanks,
>> Jesús
>>
>>
>> On 2023/12/05 06:30:38 Gaurav Agarwal wrote:
>> > HI
>> > Thanks for this mail , I would like to know is there any group discussion
>> > also happened or any call to discuss the issues.
>> >
>> > thanks
>> >
>> >
>> > On Tue, Dec 5, 2023 at 9:29 AM Walaa Eldin Moustafa <wa.moust...@gmail.com> wrote:
>> >
>> > > Thanks Jesus for sharing OneTable. Looks like it touches upon some of the
>> > > topics we discussed in the Rise of Table Formats panel at VLDB
>> > > <https://ceur-ws.org/Vol-3462/CDMS18.pdf> back in September. I was
>> > > browsing through the source code, and I ran into the OneField
>> > > <https://github.com/onetable-io/onetable/blob/main/api/src/main/java/io/onetable/model/schema/OneField.java> class
>> > > and noticed it has support for default values, which is good, but in the
>> > > Iceberg spec, there are two default values
>> > > <https://iceberg.apache.org/spec/#default-values> (more details in the
>> > > spec and respective PR). I was pointing this out as an example of small
>> > > nuances that can differ from one format to another and was wondering how
>> > > OneTable is planning to bridge them?
>> > >
>> > > Thanks,
>> > > Walaa.
>> > >
>> > >
>> >
>> >

--
Ryan Blue
Tabular