Well, everybody who actively contributed to the discussion on the
original google doc had reached consensus. That's why I brought up the
topic at the Community Sync on 2024-02-14
(https://youtu.be/uAQVGd5zV4I?t=890) to raise awareness in the
broader community, after which the discussion about the storage model
started. I don't think that the discussion about a single aspect of a
proposal should invalidate all other aspects of the proposal.
Regardless, the state of the proposal from the original google doc
contains a lot of valuable contributions from Micah, Szehon, Jack, Dan,
yourself and others and it should at least provide the basis for any
further discussion. I don't think it's effective to start with a
completely different design because we are bound to have the same
discussions all over again.
Thanks, Jan
On 08.05.24 12:11, Walaa Eldin Moustafa wrote:
The only consensus the community had was on the object model through
the most recent voting thread [1]. This kind of consensus was not
present during the doc discussions, and this should be evident from
the fact that the last state of the doc listed 5 alternatives with no
particular
conclusion. I am not quite sure what type of consensus we are
referring to here given all the follow up discussions, alternatives, etc.
Due to the separate object model, the PR is fundamentally different
from the doc in the sense that it does not propose a new metadata model but
rather formalizes some new table and view properties related to MVs.
That is also one reason there are no repeated discussions. That said,
if you feel there is a repeated discussion (which I do not see so
far), it would be best to link the relevant discussion from the doc in
a comment.
Happy to move the discussion elsewhere if there is sufficient support
for this idea, but as things stand, I do not see this as an efficient
way to make progress. It sounds like we have been re-emphasizing the same
points in the last two replies, so I will let others chime in at this
point.
[1] https://lists.apache.org/thread/rotmqzmwk5jrcsyxhzjhrvcjs5v3yjcc
Thanks,
Walaa.
On Wed, May 8, 2024 at 2:31 AM Jan Kaul <jank...@mailbox.org.invalid>
wrote:
The original google doc
<https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?usp=sharing>
discussed multiple aspects of the Materialized View spec. One was
the storage model while others were related to the metadata. After
we (Micah, Szehon, you, me) reached consensus in the google doc,
Jack raised his concern about the storage model and the long
discussion about the storage model started. Now we have truly reached
consensus about the storage model, which is now also reflected in
the google doc. All other aspects from the google doc about the
metadata weren't questioned and still represent the consensus.
I would like to *avoid repeating the discussions* in your PR that
we already had in the google doc, especially since reaching
consensus took a considerable amount of time.
Thanks, Jan
On 08.05.24 10:21, Walaa Eldin Moustafa wrote:
Thanks Jan. I think we moved on to more alignment steps beyond
that doc a while ago. After that doc, we have discussed the topic
further in 2 dev list threads and one more doc
<https://docs.google.com/document/d/1zg0wQ5bVKTckf7-K_cdwF4mlRi6sixLcyEh6jErpGYY/edit?pli=1>
(with strictly two options for the storage model to consider).
Moreover, the original doc grew to 14 pages long with one section
comparing 5 design alternatives, which made it harder to
reach consensus. The lack of consensus is partly what led to
the subsequent discussions and call for a more focused approach
to reach consensus. If we already have a consensus on the storage
model (separate tables and views), I think we should take things
further and have continued focused discussions on the specific
metadata in the form of a PR. I have included all previous
discussions including the original doc and issue as references in
the PR description. Please let me know if this works. Happy to
hear others' thoughts on the best way to move forward.
Thanks,
Walaa.
On Wed, May 8, 2024 at 12:56 AM Jan Kaul
<jank...@mailbox.org.invalid> wrote:
Thanks Walaa for trying to move things along. However, I don't
think it's a good idea to start a separate discussion about
the metadata for materialized views because we already had
this discussion and reached consensus in this google doc:
https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?usp=sharing
Once the draft is finalized we can adapt the PR to reflect
the consensus from the google doc.
Best wishes,
Jan
On 07.05.24 19:11, Walaa Eldin Moustafa wrote:
Thanks Steven. I feel it is needed so the MV spec is not
scattered across the table and view spec pages. We may add a
reference in each respective properties section.
On Tue, May 7, 2024 at 10:04 AM Steven Wu
<stevenz...@gmail.com> wrote:
Walaa, thanks for initiating the next step.
With the agreed model of separate view and storage
table, I am wondering if a separate materialized view
spec page is needed. E.g., the new view metadata
(view-materialized and view-storage-table) would probably
be better added to the view page directly to avoid
information scattering. The same can be said about the
storage table metadata.
We may keep the separate materialized view page to
document motivation, freshness semantics, etc.
On Mon, May 6, 2024 at 10:58 PM Walaa Eldin Moustafa
<wa.moust...@gmail.com> wrote:
Hi Everyone,
Thanks again for participating in the modeling
discussion [1]. Since the outcome of this discussion
was to model materialized views as separate objects,
an Iceberg view and a table, I think the next step
should be discussing the metadata details for each
object. I have created a PR
https://github.com/apache/iceberg/pull/10280 with an
initial spec improvement. Please feel free to review
it and leave feedback there.
[1]
https://lists.apache.org/thread/rotmqzmwk5jrcsyxhzjhrvcjs5v3yjcc
Thanks,
Walaa.