Re: Materialized view integration with REST spec

2024-04-17 Thread Walaa Eldin Moustafa
As an update, there is more common understanding now of the options in the doc. Please feel free to take another look. The most relevant comment at this point is this comment. Based on thi

Re: Materialized view integration with REST spec

2024-04-17 Thread Renjie Liu
A kind reminder to review and discuss the proposal in the doc. On Thu, Apr 4, 2024 at 9:22 PM Jean-Baptiste Onofré wrote: > Just to clarify: I think we have a consensus on the two possible > options. So the vote could be helpful to have a consensus about which > option. > > Anyway, we still have discu

Re: Materialized view integration with REST spec

2024-04-04 Thread Jean-Baptiste Onofré
Just to clarify: I think we have a consensus on the two possible options. So the vote could be helpful to have a consensus about which option. Anyway, we still have discussions going on on this topic :) Regards JB On Wed, Apr 3, 2024 at 10:02 PM Ryan Blue wrote: > > If there is consensus, great

Re: Materialized view integration with REST spec

2024-04-03 Thread Walaa Eldin Moustafa
I think we want to get clarity on the "combined object" approach. Some discussions are still going on. There is one particular thread that would benefit from some more clarification. Would

Re: Materialized view integration with REST spec

2024-04-03 Thread Ryan Blue
If there is consensus, great. We don't usually have a vote when there is already consensus. That said, I haven't really seen a confirmation that we have consensus, like a thread where people that originally had different perspectives all said they favored the same option. It can help to build clar

Re: Materialized view integration with REST spec

2024-04-03 Thread Jean-Baptiste Onofré
I thought we had a consensus in the doc at least on the possible options. I understood the vote was to adopt one of the options (that is possible for a vote). If we still need more discussion on the possible options or having a consensus on a specific option, it makes sense to continue the discuss

Re: Materialized view integration with REST spec

2024-04-02 Thread Walaa Eldin Moustafa
Sounds good regarding voting to confirm on the direction. We can continue to try to reach consensus. I believe having common understanding about what we are evaluating (or building consensus towards) is crucial to have productive discussions. I see Jan has added more comments to the doc this mornin

Re: Materialized view integration with REST spec

2024-04-02 Thread Daniel Weeks
I don't think we're in a position to open a vote (or maybe there's a misunderstanding of what the vote sets out to achieve). We need to continue the discussion until there is a general consensus on the direction we want to go (not on what options are available). The vote is a confirmation of th

Re: Materialized view integration with REST spec

2024-04-01 Thread Jean-Baptiste Onofré
Hi Walaa Yes, I think it makes sense to go with a vote, now that pros/cons are clearly stated in the doc. Thanks ! Regards JB On Tue, Apr 2, 2024 at 3:59 AM Walaa Eldin Moustafa wrote: > > Hi all, there has not been new activity on the doc for some time. Should we > consider voting? > > On Thu,

Re: Materialized view integration with REST spec

2024-04-01 Thread Renjie Liu
+1, I think we have been clear enough about the pros and cons of each option. On Tue, Apr 2, 2024 at 10:00 AM Walaa Eldin Moustafa wrote: > Hi all, there has not been new activity on the doc > > for some

Re: Materialized view integration with REST spec

2024-04-01 Thread Walaa Eldin Moustafa
Hi all, there has not been new activity on the doc for some time. Should we consider voting? On Thu, Mar 28, 2024 at 6:59 AM Jean-Baptiste Onofré wrote: > Yes, correct, thanks Manu for pointing it out. >

Re: Materialized view integration with REST spec

2024-03-28 Thread Jean-Baptiste Onofré
Yes, correct, thanks Manu for pointing it out. Thanks ! Regards JB On Thu, Mar 28, 2024 at 9:55 AM Manu Zhang wrote: > > I think Jan already created it > https://github.com/apache/iceberg/issues/10043 > > On Thu, Mar 28, 2024 at 16:46, Jean-Baptiste Onofré wrote: >> >> Hi Walaa, >> >> Yes, I think it would be

Re: Materialized view integration with REST spec

2024-03-28 Thread Manu Zhang
I think Jan already created it https://github.com/apache/iceberg/issues/10043 On Thu, Mar 28, 2024 at 16:46, Jean-Baptiste Onofré wrote: > Hi Walaa, > > Yes, I think it would be great to create the GH Issue with the > proposal template, it would allow us to track the proposal and link > the doc (the comments sh

Re: Materialized view integration with REST spec

2024-03-28 Thread Renjie Liu
Hi, Walaa: +1 for creating a GitHub issue for tracking it with a new template. On Thu, Mar 28, 2024 at 4:47 PM Jean-Baptiste Onofré wrote: > Hi Walaa, > > Yes, I think it would be great to create the GH Issue with the > proposal template, it would allow us to track the proposal and link > the d

Re: Materialized view integration with REST spec

2024-03-28 Thread Jean-Baptiste Onofré
Hi Walaa, Yes, I think it would be great to create the GH Issue with the proposal template, it would allow us to track the proposal and link the doc (the comments should go in the doc directly). Please, let me know if I can help on that. I'm working on a PR to list the proposals on the website an

Re: Materialized view integration with REST spec

2024-03-27 Thread Walaa Eldin Moustafa
Do we need to create a proposal issue specifically to track this doc? Also, everyone, since there have been some updates, it would be good to chime in again to discuss the updates. (doc link here for convenien

Re: Materialized view integration with REST spec

2024-03-26 Thread Jean-Baptiste Onofré
It sounds good. I would also propose to use the "proposal process": creating a GitHub issue with the "proposal" tag and linking the document there in a comment. Regards JB On Tue, Mar 26, 2024 at 3:05 PM Walaa Eldin Moustafa wrote: > > Thanks Jan! To avoid spreading discussions across multiple places,

Re: Materialized view integration with REST spec

2024-03-26 Thread Walaa Eldin Moustafa
Thanks Jan! To avoid spreading discussions across multiple places, I will continue the comments on the doc. Also it is easier to run into communication gaps in email threads since effectively we have one thread, but in docs we have many. Thanks, Walaa. On Tue, Mar 26, 2024 at 6:27 AM Jan Kaul wrote:

Re: Materialized view integration with REST spec

2024-03-26 Thread Jan Kaul
I've added a description to the "Combined metadata" Option of Walaa's document. I'm also adding it here: This option treats the underlying view and storage table as a combined catalog object. The operation of this combined approach can be best demonstrated by looking at the different layers of

Re: Materialized view integration with REST spec

2024-03-26 Thread Jan Kaul
I think it makes sense if I use the "Description" section of your document to clarify what I imagine a combined MV solution to look like. This would simplify the discussion about pros and cons, because we can reference or extend the description. I will try to find the time later today. Thanks,

Re: Materialized view integration with REST spec

2024-03-25 Thread Walaa Eldin Moustafa
Thanks Jan! I am not sure if you would like to make suggestions to revise the options themselves or the current options pros and cons. In either case, as mentioned earlier, we can do that on the doc and once we agree on the options and their pros and cons we can move forward. How does that sound?

Re: Materialized view integration with REST spec

2024-03-25 Thread Jan Kaul
I have the feeling that the current pros and cons from the summary target a version of the MV spec that wasn't really part of the discussion. The current arguments target a completely new specification for materialized views, which we agreed is out of scope. Instead of a completely new speci

Re: Materialized view integration with REST spec

2024-03-25 Thread Benny Chow
Hi Manu This is Walaa's Spark implementation for option 1: https://github.com/apache/iceberg/pull/9830/files/a9e1bee3b5bf5914e5330d3b195042aea33868c9 There's no code for option 2 yet. Best Benny On Mon, Mar 25, 2024 at 12:37 AM Manu Zhang wrote: > Thanks Walaa for the summary. It's unclear to

Re: Materialized view integration with REST spec

2024-03-25 Thread Manu Zhang
Thanks Walaa for the summary. From the context, it's unclear to me which is the reference implementation for option 1 and which is the reference MV spec for option 2. I can find some links in the References section but am not sure which should be referred to respectively. On Mon, Mar 25, 2024 at 3:38 AM Walaa Eld

Re: Materialized view integration with REST spec

2024-03-24 Thread Walaa Eldin Moustafa
Thanks Himadri for the questions. At this point, our objective is to have a common understanding of both options and their pros and cons. The best way to achieve this is to iterate on the doc to discuss the details of each option or their pros and cons. We can always add more details or update the

Re: Materialized view integration with REST spec

2024-03-24 Thread himadri pal
Thanks Walaa for sharing the design document and the discussion. A few suggestions and points to consider: *First*: *How is it laid out on the underlying storage (HDFS/S3/ ...)?* As I understood, option 1 is preferred from an implementation perspective and also some engines require a semantic storage t

Re: Materialized view integration with REST spec

2024-03-24 Thread Manish Malhotra
Thanks Walaa, Option 1 seems to be the better one, and one of the primary reasons is how to keep it simple for the engine. Regards, Manish On Sun, Mar 24, 2024 at 5:02 AM Renjie Liu wrote: > Hi, Walaa: > > Thanks for your summary. I lean toward option 1, due to the huge effort > for engines to adop

Re: Materialized view integration with REST spec

2024-03-22 Thread Szehon Ho
Sounds good to me, can you start a document then, and we can all contribute there? On Fri, Mar 22, 2024 at 10:47 AM Walaa Eldin Moustafa wrote: > Let us list the pros and cons as originally planned. I can help as well if > needed. We can get started and have Jack chime in when he is back? > > On

Re: Materialized view integration with REST spec

2024-03-22 Thread Walaa Eldin Moustafa
Let us list the pros and cons as originally planned. I can help as well if needed. We can get started and have Jack chime in when he is back? On Fri, Mar 22, 2024 at 10:35 AM Szehon Ho wrote: > Hi > > My understanding was last time it was still unresolved, and the action > item was on Jack and/o

Re: Materialized view integration with REST spec

2024-03-22 Thread Szehon Ho
Hi My understanding was last time it was still unresolved, and the action item was on Jack and/or Jan to make a shorter document. I think the debate now has boiled down to Ryan's three options: 1. separate table/view 2. combination of table/view tied together via commit 3. new metadata

Re: Materialized view integration with REST spec

2024-03-22 Thread Renjie Liu
+1 On Fri, Mar 22, 2024 at 16:42 Jean-Baptiste Onofré wrote: > Hi Renjie, > > We discussed the MV proposal, without yet reaching any conclusion. > > I propose: > - to use the "new" proposal process in place (creating a GH issue with > proposal flag, with link to the document) > - use the docume

Re: Materialized view integration with REST spec

2024-03-22 Thread Jean-Baptiste Onofré
Hi Renjie, We discussed the MV proposal, without yet reaching any conclusion. I propose: - to use the "new" proposal process in place (creating a GH issue with proposal flag, with link to the document) - use the document and/or GH issue to add comments - finalize the document heading to a vote (

Re: Materialized view integration with REST spec

2024-03-21 Thread Renjie Liu
Hi: Sorry I couldn't make it to the last community sync. Did we reach any conclusion about the MV spec? On Tue, Mar 5, 2024 at 11:28 PM himadri pal wrote: > For me the calendar link did not work on mobile, but I was able to add the > dev Google calendar from > https://iceberg.apache.org/communit

Re: Materialized view integration with REST spec

2024-03-05 Thread himadri pal
For me the calendar link did not work on mobile, but I was able to add the dev Google calendar from https://iceberg.apache.org/community/#iceberg-community-events by accessing it from a laptop. Regards, Himadri Pal On Mon, Mar 4, 2024 at 4:43 PM Walaa Eldin Moustafa wrote: > Thanks Jack! I thin

Re: Materialized view integration with REST spec

2024-03-04 Thread Jack Ye
Thanks Jan! +1 for everyone to take a look before the discussion, and see if there are any missing options or major arguments. I have also added the images regarding all the options, it might be easier to parse than the big sheet. I will also put it here for people that do not have time to read th

Re: Materialized view integration with REST spec

2024-03-01 Thread Walaa Eldin Moustafa
The calendar on the site is currently broken https://iceberg.apache.org/community/#iceberg-community-events. Might help to fix it or share the meeting link here. On Fri, Mar 1, 2024 at 3:43 PM Jack Ye wrote: > Sounds good, let's discuss this in person! > > I am a bit worried that we have quite a

Re: Materialized view integration with REST spec

2024-03-01 Thread Jack Ye
Sounds good, let's discuss this in person! I am a bit worried that we have quite a few critical topics going on right now on devlist, and this will take up a lot of time to discuss. If it ends up going for too long, I propose we have a dedicated meeting, and I am more than happy to organize it

Re: Materialized view integration with REST spec

2024-03-01 Thread Ryan Blue
Hey everyone, I think this thread has hit a point of diminishing returns and that we still don't have a common understanding of what the options under consideration actually are. Since we were already planning on discussing this at the next community sync, I suggest we pick this up there and use

Re: Materialized view integration with REST spec

2024-03-01 Thread Walaa Eldin Moustafa
I am finding it hard to interpret the options concretely. I would also suggest breaking the expectation/outcome to milestones. Maybe it becomes easier if we agree to distinguish between an approach that is feasible in the near term and another in the long term, especially if the latter requires sig

Re: Materialized view integration with REST spec

2024-03-01 Thread Jack Ye
> All of these approaches are aligned in one, specific way: the storage table is an iceberg table. I do not think that is true. I think people are aligned that we would like to re-use the Iceberg table metadata defined in the Iceberg table spec to express the data in MV, but I don't think it goes

Re: Materialized view integration with REST spec

2024-03-01 Thread Daniel Weeks
I feel I've been most vocal about pushing back against options 2+ (or Ryan's categories of combined table/view, or new metadata type), so I'll try to expand on my reasoning. I understand the appeal of creating a design where we encapsulate the view/storage from both a structural and performance st

Re: Materialized view integration with REST spec

2024-02-29 Thread Jack Ye
> Jack, it sounds like you’re the proponent of a combined table and view (rather than a new metadata spec for a materialized view). What is the main motivation? It seems like you’re convinced of that approach, but I don’t understand the advantage it brings. Sorry I have to make a Google Sheet to c

Re: Materialized view integration with REST spec

2024-02-29 Thread Jan Kaul
Hi Ryan, we actually discussed your categories in this question, where your categories correspond to the following designs: * Separate table and view => Design 1 * Combination o

Re: Materialized view integration with REST spec

2024-02-29 Thread Walaa Eldin Moustafa
Ok since the option "A combination of a view and a table" also has some sort of a pointer (from the view to the table in the view metadata), and it is rejected, I think the key distinction with the first option "Separate table and view", is that in the second option, a specific version of the table

Re: Materialized view integration with REST spec

2024-02-29 Thread Ryan Blue
> Ryan, in the option "Separate table and view", will there be a reference (or pointer) to the table from the view metadata? Yes. And this is a problem we need to solve generally because a materialized table needs to be able to track the upstream state of tables that were used. I think it would be

Re: Materialized view integration with REST spec

2024-02-29 Thread Walaa Eldin Moustafa
Ryan, in the option "Separate table and view", will there be a reference (or pointer) to the table from the view metadata? Since the option of "embedding a table metadata location in view metadata" is not preferred, it is not clear how to associate the table with the view in the "Separate table and

Re: Materialized view integration with REST spec

2024-02-29 Thread Ryan Blue
Looks like it wasn’t clear what I meant for the 3 categories, so I’ll be more specific: - *Separate table and view*: this option is to have the objects that we have today, with extra metadata. Commit processes are separate: committing to the table doesn’t alter the view and committing to

Re: Materialized view integration with REST spec

2024-02-29 Thread Walaa Eldin Moustafa
One additional advantage of the separate view and table approach is that it will save the need to change all engine catalog APIs to expose materialized views as separate objects with their own engine catalog APIs. Hence, Iceberg can add the materialized view support without being blocked on other e

Re: Materialized view integration with REST spec

2024-02-29 Thread Szehon Ho
Hi Yes I mostly agree with the assessment. To clarify a few minor points. is a materialized view a view and a separate table, a combination of the > two (i.e. commits are combined), or a new metadata type? For 'new metadata type', I consider mostly Jack's initial proposal of a new Catalog MV o

Re: Materialized view integration with REST spec

2024-02-29 Thread Jan Kaul
Hi all, I would like to provide my perspective on the question of what a materialized view is and elaborate on Jack's recent proposal to view a materialized view as a catalog concept. Firstly, let's look at the role of the catalog. Every entity in the catalog has a *unique identifier*, and t

Re: Materialized view integration with REST spec

2024-02-28 Thread Walaa Eldin Moustafa
Thanks Ryan for the insights. I agree that reusing existing metadata definitions and minimizing spec changes are very important. This also minimizes spec drift (between materialized views and views spec, and between materialized views and tables spec), and simplifies the implementation. In an effo

Re: Materialized view integration with REST spec

2024-02-28 Thread Ryan Blue
I mean separate table and view metadata that is somehow combined through a commit process. For instance, keeping a pointer to a table metadata file in a view metadata file or combining commits to reference both. I don't see the value in either option. On Wed, Feb 28, 2024 at 5:05 PM Jack Ye wrote

Re: Materialized view integration with REST spec

2024-02-28 Thread Jack Ye
Sorry I guess another longer question: *What do we even mean here when we use the terms of table "metadata", view "metadata" and new "metadata" type?* This was clear before the REST spec was introduced, but is not so clear now. Maybe this is a good time to clarify it. If we look into the table/v

Re: Materialized view integration with REST spec

2024-02-28 Thread Jack Ye
Thanks Ryan for the help to trace back to the root question! Just a clarification question regarding your reply before I reply further: what exactly does the option "a combination of the two (i.e. commits are combined)" mean? How is that different from "a new metadata type"? -Jack On Wed, Feb

Re: Materialized view integration with REST spec

2024-02-28 Thread Ryan Blue
I’m catching up on this conversation, so hopefully I can bring a fresh perspective. Jack already pointed out that we need to start from the basics and I agree with that. Let’s remove voting at this point. Right now is the time for discussing trade-offs, not lining up and taking sides. I realize th

Re: Materialized view integration with REST spec

2024-02-22 Thread Szehon Ho
Hi Jan I agree with Walaa, I think the new Question should be narrow (View = View + Materialization, or new MV metadata), with 3 options (Materialization can be metadata.json or nested object). We can mention that with the former, we have another decision whether to register it (and then refer to

Re: Materialized view integration with REST spec

2024-02-21 Thread Walaa Eldin Moustafa
Thanks Jack! I feel Question 0 is very broad, essentially capturing the whole design. Can we start by discussing more granular questions? On Wed, Feb 21, 2024 at 8:53 PM Jack Ye wrote: > Thanks everyone for the help in organizing the thoughts! > > I have moved the summary of everyone's comments

Re: Materialized view integration with REST spec

2024-02-21 Thread Jack Ye
Thanks everyone for the help in organizing the thoughts! I have moved the summary of everyone's comments here also to the doc that Jan linked under question 0. We can continue to have more discussions there and cast votes! Best, Jack Ye On Wed, Feb 21, 2024 at 12:14 PM Jan Kaul wrote: > Thanks

Re: Materialized view integration with REST spec

2024-02-21 Thread Jan Kaul
Thanks Micah, I think the voting chips are great. @Szehon, actually what I had in mind was not to have one thread per question but rather have smaller threads that can be resolved more easily. I have the fear that one thread for the current question would lead to a very long and unmanageable d

Re: Materialized view integration with REST spec

2024-02-21 Thread Micah Kornfield
> > Of course we also need threads that express our preferences (voting). I > would suggest to keep these separate from discussions about single points > so that they can be persisted in the document. Not sure if it is helpful, but I added voting chips to Question 0, as maybe an easier way to keep trac

Re: Materialized view integration with REST spec

2024-02-21 Thread Szehon Ho
Thanks Jan. +1 on having just one thread per question for vote/preference. Where do you suggest we have it, on the discussion question itself? It would be to keep the existing threads and move it there. Also, I think it makes sense to create a Slack channel (for quick questions and replies), and a

Re: Materialized view integration with REST spec

2024-02-21 Thread Jan Kaul
Thank you Jack for driving the consensus for the MV spec and thank you all for the discussion. I really like the idea about incremental consensus because we often lose sight in detailed discussions. As Jack mentioned, the highest priority question currently is: *Should the Iceberg MV be reali

Re: Materialized view integration with REST spec

2024-02-20 Thread Manish Malhotra
Very excited for MV to be in Iceberg :) Keeping it in the same doc would be helpful, to have the trail. But also agreed, if there are too many directions/threads, then keep closing the old ones, if there are no more questions. And put down the assumptions for the initial version to move forward. On

Re: Materialized view integration with REST spec

2024-02-20 Thread Walaa Eldin Moustafa
I would vote to keep a log in the doc with open questions, and keep the doc updated with open questions as they arise/get resolved. On Tue, Feb 20, 2024 at 11:37 AM Jack Ye wrote: > Thanks for the response from everyone! > > Before proceeding further, I see a few people referring back to the > c

Re: Materialized view integration with REST spec

2024-02-20 Thread Jack Ye
Thanks for the response from everyone! Before proceeding further, I see a few people referring back to the current design from Jan. I specifically raised this thread based on the information in the doc and a few latest discussions we had there. Because there are many threads in the doc, and each t

Re: Materialized view integration with REST spec

2024-02-19 Thread Szehon Ho
Hi, Great to see more discussion on the MV spec. Actually, Jan's document "Iceberg Materialized View Spec" has been organized, with a "Design Questions" section to track these debates, and it would be nice to centr

Re: Materialized view integration with REST spec

2024-02-19 Thread Walaa Eldin Moustafa
I think it would help if we answer the question of whether an MV is a view + storage table (and degree of exposing this underlying implementation) in the context of the user interfacing with those concepts: For the end user, interfacing with the engine APIs (e.g., through SQL), materialized view A

Re: Materialized view integration with REST spec

2024-02-19 Thread Micah Kornfield
Hi Jack, > In my mind, the first key point we all need to agree upon to move this > design forward is*: Do we really want to go with the MV = view + storage > table design approach for Iceberg MV?* I think we want this to the extent that we do not want to redefine the same concept with differen

Re: Materialized view integration with REST spec

2024-02-19 Thread Daniel Weeks
To address the specific question about MV = view + storage, I do feel that is the right approach. (The alternative would actually fit more cleanly with the "materialized table" concept, but there are a lot of reasons that probably isn't a great path to go down.) In many ways the materialized view

Re: Materialized view integration with REST spec

2024-02-19 Thread Jack Ye
I suggest we need a step-by-step process to make incremental consensus, otherwise we are constantly talking about many different debates at the same time. In my mind, the first key point we all need to agree upon to move this design forward is*: Do we really want to go with the MV = view + storage

Re: Materialized view integration with REST spec

2024-02-19 Thread Daniel Weeks
Jack, I think we should consider either allowing the storage table to be fully exposed/addressable via the catalog or allow access via namespacing like with metadata tables. E.g. ..., which would allow for full access to the underlying table. For other engines to interact with the storage table

Re: Materialized view integration with REST spec

2024-02-18 Thread Renjie Liu
Hi, Jack: Thanks for raising this. In most database systems, MV, view and table are considered independent > objects, at least at API level. It is very rare for a system to support > operations like "materializing a logical view" or "upgrading a logical view > to MV", because view and MV are very

Materialized view integration with REST spec

2024-02-16 Thread Jack Ye
Hi everyone, As we are discussing the spec change for materialized view, there has been a missing aspect that is technically also related, and might affect the MV spec design: *how do we want to add MV support to the REST spec?* I would like to discuss this in a new thread to collect people's tho