Thank you for your comments. I should have provided a user story to make the use case clearer.

While the WAP pattern is probably the most common use of the branching feature of Iceberg tables, branching can also be used in other ways. The following user story shows branching for views and Iceberg tables without the WAP pattern.

User story:

There is an existing data pipeline that ingests data from an operational system into an Iceberg table called "table_staging". The table "table_staging" is used in the query definition of a view called "view_cleaned". A data engineer is given the task of dropping a column from the table "table_staging". They would like to perform the task in a development environment where they can test whether dependent entities are affected by the change to the table. Therefore, they create a branch of the table "table_staging" where they drop the column, and a branch of the view "view_cleaned" where they adjust the query definition so that everything works as expected.
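
To make the story concrete, here is a minimal toy sketch in plain Python. It is not the Iceberg API and does not claim to reflect how Iceberg stores branches or schemas; the class `Versioned`, the helper `view_is_valid`, the branch name "dev", and the column names are all hypothetical and exist only for illustration. It assumes a branch is simply a named pointer to one version of an entity.

```python
from dataclasses import dataclass, field


@dataclass
class Versioned:
    """Toy stand-in for a table or a view (hypothetical, not an Iceberg class).

    Holds a list of versions and a mapping from branch name to version index.
    """
    versions: list = field(default_factory=list)
    branches: dict = field(default_factory=dict)

    def commit(self, branch, value):
        # Append a new version and move only the given branch to it.
        self.versions.append(value)
        self.branches[branch] = len(self.versions) - 1

    def create_branch(self, name, source="main"):
        # A new branch starts at the version its source branch points to.
        self.branches[name] = self.branches[source]

    def read(self, branch="main"):
        return self.versions[self.branches[branch]]


def view_is_valid(view, view_branch, table, table_branch):
    """A view version is valid if every column it selects exists in the table."""
    return set(view.read(view_branch)) <= set(table.read(table_branch))


# Assumed starting point: the table has three columns, the view selects all of them.
table_staging = Versioned()
table_staging.commit("main", ("id", "name", "legacy_code"))
view_cleaned = Versioned()
view_cleaned.commit("main", ("id", "name", "legacy_code"))

# The engineer drops a column on a "dev" branch of the table only.
table_staging.create_branch("dev")
table_staging.commit("dev", ("id", "name"))

# The unchanged view definition no longer works against the dev table.
broken = not view_is_valid(view_cleaned, "main", table_staging, "dev")

# The view definition is adjusted on a matching "dev" branch of the view.
view_cleaned.create_branch("dev")
view_cleaned.commit("dev", ("id", "name"))
```

In this sketch the "main" branches of both entities stay untouched, so the existing pipeline keeps working while the change is tested on "dev".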

I'm not sure whether this kind of usage was envisioned when branching was added to Iceberg tables, but it is possible.

Thanks,

Jan

On 14.11.23 09:04, Walaa Eldin Moustafa wrote:
Also, view metadata versions and (underlying) table snapshots/versions are orthogonal concepts. For example, theoretically, one could time-travel in views along two dimensions: view metadata version and underlying data version. Hence, I do not think that data versioning in tables corresponds exactly to view metadata versioning. Instead of mapping/porting the feature from tables to views, we can approach this by discussing the use case we are trying to unlock with this proposal. Maybe there is a better way to support the use case.

Thanks,
Walaa.

On Mon, Nov 13, 2023 at 11:47 PM Ajantha Bhat <ajanthab...@gmail.com> wrote:

    Hi Jan,

    In my view, branches are primarily intended for isolating tests
    and later merging them back (commonly referred to as the WAP
    scenario).
    Tags, conversely, serve the purpose of marking significant
    snapshots for reproducibility or auditing.

    Views essentially act as a shorthand for queries. Creating or
    replacing a view is a metadata operation with no data involvement.
    So, branching and tagging support may be overkill.

    However, when dealing with materialized views, it becomes crucial
    to support branching and tagging, given that these operations
    involve data manipulation.

    Thanks,

    Ajantha
