Hey Everyone

I've been following the MV spec and listened in on the last community
sync.  I'd like to chime in from a query planner point of view on how the
MVs could be used.

Suppose a user has a dashboard query like:

*SELECT product, sum(sales) *
*FROM view1 *
*WHERE brand = 'X' and year = '2024'*
*GROUP BY 1 *

And it performs awful.  So, the user creates a materialized view
*view1_mat *which is basically a materialization of select * on view1.  If
the planner is smart enough, it can rewrite the original query to use
*view1**_mat *without the user needing to explicitly reference the MV.
Will the Iceberg MV spec provide metadata to help link *view1 *to its
materialization *view1_mat?*

The above is a simple example as there could be multiple materializations
for view1:

*view1_mat2* - partitioned by brand
*view1_mat3* - partitioned by year
*view1_mat4 *- filtered by the year 2024
*....*

So, from a query planner point of view, it should know about all these
candidate materializations for *view1, *cost each one separately and select
the one that results in the lowest overall plan cost.

I think it would be really great for Iceberg MV adoption if the above use
case was made easier.  Basically, it helps users leverage MVs without
explicitly referencing the MVs in their queries.

Thanks
Benny

Reply via email to