Hi, Jan:

Thanks for raising this. I think this case is not just about branching/tagging
a view, but rather about branching/tagging the catalog?

On Tue, Nov 14, 2023 at 6:10 PM Jan Kaul <jank...@mailbox.org.invalid>
wrote:

> Thank you for your comments. I should have provided a user story to make
> the use case more clear.
>
> While the WAP pattern is probably the most common usage for the branching
> feature of Iceberg tables, it could also be used in different ways. The
> following is a user story showcasing branching for views and Iceberg
> tables without the WAP pattern.
>
> User story:
>
> There is an existing data pipeline that ingests data from an operational
> system into an Iceberg table called "table_staging". The table
> "table_staging" is used in the query definition of a view called
> "view_cleaned". A data engineer is given the task to drop a column of the
> table "table_staging". They would like to perform the task in a development
> environment where they can test whether dependent entities are affected by
> the change to the table. Therefore they create a branch of the table
> "table_staging" where they drop the column and a branch of the view
> "view_cleaned" where they adjust the query definition such that everything
> is working as expected.
>
> I'm not sure if this kind of usage was envisioned when table branching was
> added to Iceberg tables, but it is possible.
>
> Thanks,
>
> Jan
> On 14.11.23 09:04, Walaa Eldin Moustafa wrote:
>
> Also, view metadata versions and (underlying) table snapshots/versions are
> orthogonal concepts. For example, theoretically, one could time-travel in
> views along two dimensions: view metadata version and underlying data
> version. Hence, I do not think that data versioning in tables corresponds
> exactly to view metadata versioning. Instead of mapping/porting the feature
> from tables to views, we can approach this by discussing the use case we
> are trying to unlock with this proposal. Maybe there is a better way to
> support the use case.
>
> Thanks,
> Walaa.
>
> On Mon, Nov 13, 2023 at 11:47 PM Ajantha Bhat <ajanthab...@gmail.com>
> wrote:
>
>> Hi Jan,
>>
>> In my view, branches are primarily intended for isolating tests and later
>> merging them back (commonly referred to as the WAP scenario).
>> Tags, conversely, serve the purpose of marking significant snapshots for
>> reproducibility or auditing.
>>
>> Views essentially act as a shorthand for queries. Creating or replacing a
>> view is a metadata operation with no data involvement.
>> So, branching and tagging support may be overkill.
>>
>> However, when dealing with materialized views, it becomes crucial to
>> support branching and tagging, given that these operations involve data
>> manipulation.
>>
>> Thanks,
>>
>> Ajantha
>>
>
