Iceberg supports branching so that you can safely perform such tests without any risk of corrupting the table. No need to create a separate table and clone the config. Overall, I don’t think it is a good idea to break the contract of CREATE TABLE LIKE.
- Anton

> On Apr 27, 2023, at 11:59 AM, Pucheng Yang <py...@pinterest.com.INVALID> wrote:
>
> Hi Anton,
>
> Yes, I want to branch the table state and reuse the data files, but for test
> purposes only. Imagine we want to test something related to reading the
> Iceberg table or performing a row-level update.
>
> And I acknowledge the potential risk of the table state being corrupted. So I
> am thinking we can consider adding these limitations when running the "create
> table like":
> (1) the created table should have "snapshot=true"
> (2) the created table should have "gc.enabled=false" to make sure existing
> files don't get messed up
> (3) the created table should have a table location different than the
> existing Iceberg table location it is created from
> We can consider "create table like" as a snapshot action for an existing
> Iceberg table, similar to the existing snapshot procedure we have for an
> existing Hive table.
>
> I know CREATE TABLE LIKE is supposed to reuse the existing table definition
> only. If we have concerns around messing up table state, I wish we could
> break the implementation down into parts and at least first implement the
> part where we create tables without reusing the existing data files.
>
> On Wed, Apr 26, 2023 at 8:26 AM Anton Okolnychyi
> <aokolnyc...@apple.com.invalid> wrote:
> Pucheng, you mentioned you want to reuse existing data in the new table?
> Branching Iceberg table state can lead to unexpected situations as there will
> be multiple pointers in the catalog to the same state, which can eventually
> corrupt the table. Isn’t CREATE TABLE LIKE supposed to just reuse the
> existing table definition without copying the data?
>
> - Anton
>
>> On Apr 26, 2023, at 5:41 AM, Zoltán Borók-Nagy <borokna...@apache.org> wrote:
>>
>> As a reference, Impala can also do Hive-style CREATE TABLE x LIKE y for
>> Iceberg tables.
>> You can see various examples at
>> https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/iceberg-create-table-like-table.test
>>
>> - Zoltan
>>
>> On Wed, Apr 26, 2023 at 4:10 AM Ryan Blue <b...@tabular.io> wrote:
>> You should be able to see how other DSv2 commands are written and copy them.
>> Look at Drop Table, maybe, and see if you can copy the structure, but instead
>> of dropping, load the table and call createTable with its metadata.
>>
>> On Tue, Apr 25, 2023 at 4:42 PM Pucheng Yang <py...@pinterest.com.invalid> wrote:
>> Thanks Steve and Ryan for the reply.
>>
>> Steve, I am not looking for CTAS, my goal is to create an Iceberg table and
>> reuse the existing data (same as the create table like statement above).
>> Also my question is not about specifying location in create statement.
>>
>> Ryan, the engine we are interested in is SparkSQL. Since you mentioned it is
>> an easy fix, would you please share how that should be implemented such that
>> anyone (maybe myself) interested in this can explore the solution?
>>
>> Thanks both again.
>>
>> On Tue, Apr 25, 2023 at 4:07 PM Ryan Blue <b...@tabular.io> wrote:
>> Pucheng, what engine are you interested in?
>>
>> This works fine in Trino: CREATE TABLE table_copy (LIKE source_table
>> INCLUDING PROPERTIES)
>>
>> I don’t know if it works in Hive, and last time I checked it was not
>> implemented for DSv2 in Spark. The Spark problem should be an easy fix.
>>
>> Ryan
>>
>> On Tue, Apr 25, 2023 at 2:43 PM Steve Zhang <hongyue_zh...@apple.com.invalid> wrote:
>> Hey Pengcheng,
>>
>> Are you looking for CTAS, as in
>> https://iceberg.apache.org/docs/latest/spark-ddl/#create-table--as-select ?
>> I think you can also specify an explicit location as part of the create
>> statement, as in
>> https://iceberg.apache.org/docs/latest/spark-ddl/#create-table
>>
>> Thanks,
>> Steve Zhang
>>
>>> On Apr 25, 2023, at 1:46 PM, Pucheng Yang <py...@pinterest.com.INVALID> wrote:
>>>
>>> Hi all,
>>>
>>> I wonder how folks in the community deal with the cases where you want to
>>> create a test table from an existing Iceberg table? In Hive, what we
>>> normally do is to run a query "create table x like y location z". But we
>>> can't do this for the Iceberg table.
>>>
>>> If this is a feature that is missing, should we collaborate to build a
>>> similar feature?
>>>
>>> Thanks
>>
>> --
>> Ryan Blue
>> Tabular
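
For anyone following the thread, the three limitations Pucheng proposed for a data-reusing "create table like" can be read as a simple pre-flight check. The sketch below is only an illustration of that proposal, not anything that exists in Iceberg: the function name and its arguments are hypothetical, and the property keys "snapshot" and "gc.enabled" are taken from the thread as written.

```python
from typing import Dict, List


def check_create_like_limits(
    source_location: str,
    new_location: str,
    new_properties: Dict[str, str],
) -> List[str]:
    """Hypothetical helper: list which of the three proposed limits are violated.

    An empty list means the new table satisfies all three limits.
    """
    problems: List[str] = []

    # (1) the created table should be marked with snapshot=true
    if new_properties.get("snapshot", "").lower() != "true":
        problems.append("snapshot must be true")

    # (2) gc.enabled must be explicitly set to false so that files shared
    #     with the source table are not cleaned up through the new table
    if new_properties.get("gc.enabled", "").lower() != "false":
        problems.append("gc.enabled must be false")

    # (3) the new table location must differ from the source table location
    #     (a trailing slash alone does not count as a difference)
    if new_location.rstrip("/") == source_location.rstrip("/"):
        problems.append("location must differ from the source table location")

    return problems


if __name__ == "__main__":
    # Made-up example locations and properties, for illustration only.
    print(
        check_create_like_limits(
            "s3://bucket/db/y",
            "s3://bucket/db/x",
            {"snapshot": "true", "gc.enabled": "false"},
        )
    )
```

A check like this only guards the new table's own settings. It does not address Anton's concern that two catalog pointers to the same state can still diverge, which is why he suggests branching instead.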