Hi Ryan,

Thanks for the clarification. Yes, I think my confusion was caused by the
fact that many engines treat CREATE OR REPLACE as the semantic equivalent of
an atomic DROP + CREATE (e.g., Flink [1]). Table formats add history on top
of that, which is expected to be retained; no questions here. Permission
propagation also makes sense. For properties, things become a bit blurry:
on the one hand there are Iceberg-specific properties, which may affect
table maintenance, and on the other hand there are user-defined properties
in the same bag. The question came up in the first place because I observed
a discrepancy in Trino: all catalogs except REST completely override table
properties on REPLACE, while the REST catalog merges them, which might be
confusing to end users. Perhaps some clarification at the spec level would
be useful, because without agreement between engines there could be subtle
bugs in multi-engine environments, such as sudden data format changes
between replaces.
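
To make the two behaviours concrete, here is a minimal Python sketch using
the [a=1, b=2] / [b=3, c=4] example from my original message. The function
names are hypothetical, and this is not the actual Trino or Iceberg code,
only a model of the two outcomes a user may see on REPLACE:

```python
# Hypothetical helpers modelling the two observed REPLACE behaviours.
# They are illustrative only and do not mirror any real catalog API.

def override_properties(old: dict, new: dict) -> dict:
    """Full override: only properties given in the REPLACE survive."""
    return dict(new)


def merge_properties(old: dict, new: dict) -> dict:
    """Merge: old properties are kept; new values win on key conflicts."""
    return {**old, **new}


old_props = {"a": "1", "b": "2"}
new_props = {"b": "3", "c": "4"}

overridden = override_properties(old_props, new_props)
merged = merge_properties(old_props, new_props)

print("override:", overridden)
print("merge:   ", merged)
```

With the merge behaviour, a property such as a file format that was set
long ago silently carries over, whereas with the override behaviour it
falls back to whatever default the engine applies.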

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-or-replace-table

Regards,
Vladimir.

On Sun, Oct 20, 2024 at 9:20 PM rdb...@gmail.com <rdb...@gmail.com> wrote:

> Hi Vladimir,
>
> This isn't a bug. The behavior of CREATE OR REPLACE is to replace the data
> of a table, but to maintain things like other refs, snapshot history,
> permissions (if supported by the catalog), and table properties. Table
> properties are replaced if they are set in the operation like `b` in your
> example. This is not the same as a drop and create, which may be what you
> want instead.
>
> The reason for this behavior is that the CREATE OR REPLACE operation is
> used to replace a table's data without needing to handle schema changes
> between versions. For example, producing a daily report table that replaces
> the previous day. However, the table still exists and it is valuable to be
> able to time travel to older versions or to be able to use branches and
> tags. Clearly, that means that table history and refs stick around so the
> table is not completely new every time it is replaced.
>
> Adding on to that, properties control things like ref and snapshot
> retention, file format, compression, and other settings. These aren't
> settings that need to be carried through in every replace operation. And it
> would make no sense if you set the snapshot retention because older
> snapshots are retained, only to have it discarded the next time you replace
> the table data. A good way to think about this is that table properties are
> set infrequently, while table data changes regularly. And the person
> changing the data may not be the person tuning the table settings.
>
> Hopefully that helps,
>
> Ryan
>
> On Sun, Oct 20, 2024 at 9:45 AM Vladimir Ozerov <voze...@querifylabs.com>
> wrote:
>
>> Hi,
>>
>> Consider a REST catalog where a user runs a "CREATE OR REPLACE <table>"
>> command. When processing the command, engines will usually initiate a
>> "createOrReplace" transaction and add metadata, such as the properties of
>> the new table.
>>
>> Users expect a table to be replaced with a new one if it exists,
>> including properties. However, I observe the following:
>>
>>    1. RESTSessionCatalog loads previous table metadata, adds new
>>    properties (MetadataUpdate.SetProperties), and invokes the backend
>>    2. The backend (e.g., Polaris) will typically invoke
>>    "CatalogHandler.updateTable." There, the previous table state, including
>>    its properties, is loaded
>>    3. Finally, metadata updates are applied, and old table properties
>>    are merged with new ones. That is, if the old table has properties [a=1,
>>    b=2], and the new table has properties [b=3, c=4], then the final
>>    properties would be [a=1, b=3, c=4], while the user expects [b=3, c=4].
>>
>> It looks like a bug because the user expects complete property
>> replacement instead of a merge. Shall we explicitly clear all previous
>> properties in RESTSessionCatalog.Builder.replaceTransaction?
>>
>> Regards,
>> Vladimir.
>>
>>
>>
>> --
>> *Vladimir Ozerov*
>> Founder
>> querifylabs.com
>>
>

-- 
*Vladimir Ozerov*
Founder
querifylabs.com
