Hi, Sure, will do.
*Vladimir Ozerov* Founder querifylabs.com Ср, 23 окт. 2024 г. в 08:50, Jean-Baptiste Onofré <j...@nanthrax.net>: > I second Ryan here, it would be great to clarify in the > "implementation notes" section. > > Thanks ! > Regards > JB > > On Wed, Oct 23, 2024 at 1:10 AM rdb...@gmail.com <rdb...@gmail.com> wrote: > > > > Thanks Vladimir! Would you like to open a PR to make that change? It > sounds like another good item to put into the "Implementation notes" > section. > > > > On Sun, Oct 20, 2024 at 11:41 PM Vladimir Ozerov < > voze...@querifylabs.com> wrote: > >> > >> Hi Jean-Baptiste, > >> > >> Agreed. REST spec looks good. I am talking about the general spec, > where it might be useful to add a hint to engine developers, that CREATE OR > REPLACE semantics in Iceberg is expected to follow slightly different > semantics. This is already broken in Trino: depending on catalog type users > may get either classical "DROP + CREATE" (for non-REST catalogs), or > "CREATE AND UPDATE" for REST catalog. For Flink, their official docs say > that CREATE OR REPLACE == DROP + CREATE, while for Iceberg tables this > should not be the case. These are definitively things that should be fixed > at engine levels. But at the same time it highlights that engine developers > are having hard time defining proper semantics for CREATE OR REPLACE in the > Iceberg integrations, so a paragraph or so in the main Iceberg spec may > help us align expectations. > >> > >> Regards, > >> Vladimir. > >> > >> On Mon, Oct 21, 2024 at 8:28 AM Jean-Baptiste Onofré <j...@nanthrax.net> > wrote: > >>> > >>> Hi Vladimir, > >>> > >>> As Ryan said, it's not a bug: CREATE OR REPLACE can be seen as "CREATE > >>> AND UPDATE" from table format perspective. Specifically for the > >>> properties, it makes sense to not delete the current properties as it > >>> can be used in several use cases (security, tables grouping, ...). > >>> I'm not sure a REST Spec update is required, probably more on the > >>> engine side. In the REST Spec, you can create a table > >>> ( > https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L553 > ) > >>> and update a table > >>> ( > https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L975 > ), > >>> and it's up to the query engine to implement the "CREATE OR REPLACE" > >>> with the correct semantic. > >>> > >>> Regards > >>> JB > >>> > >>> On Sun, Oct 20, 2024 at 9:26 PM Vladimir Ozerov < > voze...@querifylabs.com> wrote: > >>> > > >>> > Hi Ryan, > >>> > > >>> > Thanks for the clarification. Yes, I think my confusion was caused > by the fact that many engines treat CREATE OR REPLACE as a semantic > equivalent of DROP + CREATE, which is performed atomically (e.g., Flink > [1]). Table formats add history on top of that, which is expected to be > retained, no questions here. Permission propagation also make sense. For > properties things become a bit blurry, because on the one hand there are > Iceberg specific properties, which may affect table maintenance, and on the > other hand there are user-defined properties in the same bag. The question > appeared in the first place because I observed a discrepancy in Trino: all > catalogs except for REST completely overrides table properties on REPLACE, > and REST catalog merges them, which might be confusing to end users. > Perhaps some clarification at the spec level might be useful, because > without agreement between engines the could be some subtle bugs in > multi-engine environments, such as sudden data format changes between > replaces, etc. > >>> > > >>> > [1] > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-or-replace-table > >>> > > >>> > Regards, > >>> > Vladimir. > >>> > > >>> > On Sun, Oct 20, 2024 at 9:20 PM rdb...@gmail.com <rdb...@gmail.com> > wrote: > >>> >> > >>> >> Hi Vladimir, > >>> >> > >>> >> This isn't a bug. The behavior of CREATE OR REPLACE is to replace > the data of a table, but to maintain things like other refs, snapshot > history, permissions (if supported by the catalog), and table properties. > Table properties are replaced if they are set in the operation like `b` in > your example. This is not the same as a drop and create, which may be what > you want instead. > >>> >> > >>> >> The reason for this behavior is that the CREATE OR REPLACE > operation is used to replace a table's data without needing to handle > schema changes between versions. For example, producing a daily report > table that replaces the previous day. However, the table still exists and > it is valuable to be able to time travel to older versions or to be able to > use branches and tags. Clearly, that means that table history and refs > stick around so the table is not completely new every time it is replaced. > >>> >> > >>> >> Adding on to that, properties control things like ref and snapshot > retention, file format, compression, and other settings. These aren't > settings that need to be carried through in every replace operation. And it > would make no sense if you set the snapshot retention because older > snapshots are retained, only to have it discarded the next time you replace > the table data. A good way to think about this is that table properties are > set infrequently, while table data changes regularly. And the person > changing the data may not be the person tuning the table settings. > >>> >> > >>> >> Hopefully that helps, > >>> >> > >>> >> Ryan > >>> >> > >>> >> On Sun, Oct 20, 2024 at 9:45 AM Vladimir Ozerov < > voze...@querifylabs.com> wrote: > >>> >>> > >>> >>> Hi, > >>> >>> > >>> >>> Consider a REST catalog and a user calls "CREATE OR REPLACE > <table>" command. When processing the command, engines will usually > initiate a "createOrReplace" transaction and add metadata, such as the > properties of a new table. > >>> >>> > >>> >>> Users expect a table to be replaced with a new one if it exists, > including properties. However, I observe the following: > >>> >>> > >>> >>> RESTSessionCatalog loads previous table metadata, adds new > properties (MetadataUpdate.SetProperties), and invokes the backend > >>> >>> The backend (e.g., Polaris) will typically invoke > "CatalogHandler.updateTable." There, the previous table state, including > its properties, is loaded > >>> >>> Finally, metadata updates are applied, and old table properties > are merged with new ones. That is, if the old table has properties [a=1, > b=2], and the new table has properties [b=3, c=4], then the final > properties would be [a=1, b=3, c=4], while the user expects [b=3, c=4]. > >>> >>> > >>> >>> It looks like a bug because the user expects complete property > replacement instead of a merge. Shall we explicitly clear all previous > properties in RESTSessionCatalog.Builder.replaceTransaction? > >>> >>> > >>> >>> Regards, > >>> >>> Vladimir. > >>> >>> > >>> >>> > >>> >>> > >>> >>> -- > >>> >>> Vladimir Ozerov > >>> >>> Founder > >>> >>> querifylabs.com > >>> > > >>> > > >>> > > >>> > -- > >>> > Vladimir Ozerov > >>> > Founder > >>> > querifylabs.com > >> > >> > >> > >> -- > >> Vladimir Ozerov > >> Founder > >> querifylabs.com >