Hi,

Sure, will do.

*Vladimir Ozerov*
Founder
querifylabs.com


On Wed, Oct 23, 2024 at 08:50, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> I second Ryan here, it would be great to clarify in the
> "implementation notes" section.
>
> Thanks !
> Regards
> JB
>
> On Wed, Oct 23, 2024 at 1:10 AM rdb...@gmail.com <rdb...@gmail.com> wrote:
> >
> > Thanks Vladimir! Would you like to open a PR to make that change? It
> sounds like another good item to put into the "Implementation notes"
> section.
> >
> > On Sun, Oct 20, 2024 at 11:41 PM Vladimir Ozerov <
> voze...@querifylabs.com> wrote:
> >>
> >> Hi Jean-Baptiste,
> >>
> >> Agreed. The REST spec looks good. I am talking about the general spec,
> where it might be useful to add a hint to engine developers that CREATE OR
> REPLACE in Iceberg is expected to follow slightly different
> semantics. This is already broken in Trino: depending on the catalog type, users
> may get either the classical "DROP + CREATE" (for non-REST catalogs) or
> "CREATE AND UPDATE" (for the REST catalog). For Flink, the official docs say
> that CREATE OR REPLACE == DROP + CREATE, while for Iceberg tables this
> should not be the case. These are definitely things that should be fixed
> at the engine level. But at the same time, it highlights that engine developers
> are having a hard time defining proper semantics for CREATE OR REPLACE in their
> Iceberg integrations, so a paragraph or so in the main Iceberg spec may
> help us align expectations.
> >>
> >> Regards,
> >> Vladimir.
> >>
> >> On Mon, Oct 21, 2024 at 8:28 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
> >>>
> >>> Hi Vladimir,
> >>>
> >>> As Ryan said, it's not a bug: CREATE OR REPLACE can be seen as "CREATE
> >>> AND UPDATE" from the table format perspective. Specifically for the
> >>> properties, it makes sense not to delete the current properties, as they
> >>> can be used in several use cases (security, table grouping, ...).
> >>> I'm not sure a REST spec update is required; the change is probably more on the
> >>> engine side. In the REST spec, you can create a table
> >>> (
> https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L553
> )
> >>> and update a table
> >>> (
> https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L975
> ),
> >>> and it's up to the query engine to implement "CREATE OR REPLACE"
> >>> with the correct semantics.
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On Sun, Oct 20, 2024 at 9:26 PM Vladimir Ozerov <
> voze...@querifylabs.com> wrote:
> >>> >
> >>> > Hi Ryan,
> >>> >
> >>> > Thanks for the clarification. Yes, I think my confusion was caused
> by the fact that many engines treat CREATE OR REPLACE as a semantic
> equivalent of DROP + CREATE, which is performed atomically (e.g., Flink
> [1]). Table formats add history on top of that, which is expected to be
> retained, no questions here. Permission propagation also makes sense. For
> properties, things become a bit blurry, because on the one hand there are
> Iceberg-specific properties, which may affect table maintenance, and on the
> other hand there are user-defined properties in the same bag. The question
> appeared in the first place because I observed a discrepancy in Trino: all
> catalogs except for REST completely override table properties on REPLACE,
> while the REST catalog merges them, which might be confusing to end users.
> Perhaps some clarification at the spec level might be useful, because
> without agreement between engines there could be some subtle bugs in
> multi-engine environments, such as sudden data format changes between
> replaces, etc.
> >>> >
> >>> > [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-or-replace-table
> >>> >
> >>> > Regards,
> >>> > Vladimir.
> >>> >
> >>> > On Sun, Oct 20, 2024 at 9:20 PM rdb...@gmail.com <rdb...@gmail.com>
> wrote:
> >>> >>
> >>> >> Hi Vladimir,
> >>> >>
> >>> >> This isn't a bug. The behavior of CREATE OR REPLACE is to replace
> the data of a table, but to maintain things like other refs, snapshot
> history, permissions (if supported by the catalog), and table properties.
> Table properties are replaced if they are set in the operation like `b` in
> your example. This is not the same as a drop and create, which may be what
> you want instead.
> >>> >>
> >>> >> The reason for this behavior is that the CREATE OR REPLACE
> operation is used to replace a table's data without needing to handle
> schema changes between versions. For example, producing a daily report
> table that replaces the previous day. However, the table still exists and
> it is valuable to be able to time travel to older versions or to be able to
> use branches and tags. Clearly, that means that table history and refs
> stick around so the table is not completely new every time it is replaced.
> >>> >>
> >>> >> Adding on to that, properties control things like ref and snapshot
> retention, file format, compression, and other settings. These aren't
> settings that need to be carried through in every replace operation. And it
> would make no sense if you set the snapshot retention because older
> snapshots are retained, only to have it discarded the next time you replace
> the table data. A good way to think about this is that table properties are
> set infrequently, while table data changes regularly. And the person
> changing the data may not be the person tuning the table settings.
> >>> >>
> >>> >> Hopefully that helps,
> >>> >>
> >>> >> Ryan
> >>> >>
> >>> >> On Sun, Oct 20, 2024 at 9:45 AM Vladimir Ozerov <
> voze...@querifylabs.com> wrote:
> >>> >>>
> >>> >>> Hi,
> >>> >>>
> >>> >>> Consider a REST catalog where a user runs a "CREATE OR REPLACE
> <table>" command. When processing the command, engines will usually
> initiate a "createOrReplace" transaction and add metadata, such as the
> properties of the new table.
> >>> >>>
> >>> >>> Users expect a table to be replaced with a new one if it exists,
> including properties. However, I observe the following:
> >>> >>>
> >>> >>> 1. RESTSessionCatalog loads previous table metadata, adds new
> properties (MetadataUpdate.SetProperties), and invokes the backend.
> >>> >>> 2. The backend (e.g., Polaris) will typically invoke
> "CatalogHandler.updateTable." There, the previous table state, including
> its properties, is loaded.
> >>> >>> 3. Finally, metadata updates are applied, and old table properties
> are merged with new ones. That is, if the old table has properties [a=1,
> b=2], and the new table has properties [b=3, c=4], then the final
> properties would be [a=1, b=3, c=4], while the user expects [b=3, c=4].
> >>> >>>
> >>> >>> It looks like a bug because the user expects complete property
> replacement instead of a merge. Shall we explicitly clear all previous
> properties in RESTSessionCatalog.Builder.replaceTransaction?
> >>> >>>
> >>> >>> Regards,
> >>> >>> Vladimir.
>
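For readers skimming the thread, the property example from the original message (old table [a=1, b=2], new table [b=3, c=4]) can be sketched in a few lines. This is only an illustrative model using plain dictionaries; the function names `replace_with_merge` and `drop_and_create` are hypothetical and are not part of the Iceberg API or of any engine.

```python
def replace_with_merge(old_props, new_props):
    """Model of CREATE OR REPLACE as described in the thread: existing
    properties are kept, and only keys set by the operation are overwritten."""
    merged = dict(old_props)   # copy so the old table state is not mutated
    merged.update(new_props)   # keys set in the operation win
    return merged


def drop_and_create(old_props, new_props):
    """Model of the classical DROP + CREATE: nothing from the old table survives."""
    return dict(new_props)


old = {"a": "1", "b": "2"}
new = {"b": "3", "c": "4"}

print(replace_with_merge(old, new))  # {'a': '1', 'b': '3', 'c': '4'}
print(drop_and_create(old, new))     # {'b': '3', 'c': '4'}
```

The first result matches the merge behavior reported for the REST catalog; the second matches what a user expecting DROP + CREATE would anticipate.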
