Thanks Walaa and JB.  Really looking forward to seeing how this spec
evolves next.

I think different engines who share these MVs will have very different
levels of rewrite and (incremental) refresh capabilities.  So capturing as
much as possible around the query, lineage, staleness tolerance and even
versioning will help engines decide what they can do with these MVs.

btw, I really liked the idea about the MV storage tables represented and
exposed to engines as regular Iceberg tables.

Best
Benny

On Fri, Mar 8, 2024 at 4:21 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Hi Benny
>
> Thanks for your message. I think the idea is to "smoothly"
> (implicitly) add regular table storage over a view.
>
> The MV approach is right now in discussion, without consensus so far.
> We plan to have document/meeting to discuss further.
>
> Regards
> JB
>
> On Fri, Mar 8, 2024 at 7:18 AM Benny Chow <btc...@gmail.com> wrote:
> >
> > Hey Everyone
> >
> > I've been following the MV spec and listened in on the last community
> sync.  I'd like to chime in from a query planner point of view on how the
> MVs could be used.
> >
> > Suppose a user has a dashboard query like:
> >
> > SELECT product, sum(sales)
> > FROM view1
> > WHERE brand = 'X' and year = '2024'
> > GROUP BY 1
> >
> > And it performs awful.  So, the user creates a materialized view
> view1_mat which is basically a materialization of select * on view1.  If
> the planner is smart enough, it can rewrite the original query to use
> view1_mat without the user needing to explicitly reference the MV.  Will
> the Iceberg MV spec provide metadata to help link view1 to its
> materialization view1_mat?
> >
> > The above is a simple example as there could be multiple
> materializations for view1:
> >
> > view1_mat2 - partitioned by brand
> > view1_mat3 - partitioned by year
> > view1_mat4 - filtered by the year 2024
> > ....
> >
> > So, from a query planner point of view, it should know about all these
> candidate materializations for view1, cost each one separately and select
> the one that results in the lowest overall plan cost.
> >
> > I think it would be really great for Iceberg MV adoption if the above
> use case was made easier.  Basically, it helps users leverage MVs without
> explicitly referencing the MVs in their queries.
> >
> > Thanks
> > Benny
> >
> >
> >
> >
> >
>

Reply via email to