Pucheng, I think it is reasonable to set up a temporary table with gc.enabled=false and otherwise identical configuration, but that isn't how CREATE TABLE LIKE behaves in most systems that I'm aware of. As I understand it, that command creates a new table with the same metadata and configuration, but without a copy of the data. What you're describing is probably something I'd build as a new stored procedure.
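
To make the distinction concrete, here is a rough Python sketch of the two behaviors. It is a toy in-memory model, not Iceberg or Spark code: TableState, create_table_like, and snapshot_for_testing are hypothetical names invented for illustration, and the "snapshot" and "gc.enabled" settings are taken from the limitations listed in the quoted proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class TableState:
    """Toy stand-in for a table's catalog entry (hypothetical, not an Iceberg class)."""
    location: str
    schema: Tuple[Tuple[str, str], ...]   # (column name, type) pairs
    partition_by: Tuple[str, ...]
    properties: Dict[str, str] = field(default_factory=dict)
    data_files: Tuple[str, ...] = ()


def create_table_like(source: TableState, location: str) -> TableState:
    """Metadata-only copy: same schema, partitioning, and properties, but no data."""
    return TableState(
        location=location,
        schema=source.schema,
        partition_by=source.partition_by,
        properties=dict(source.properties),
        data_files=(),
    )


def snapshot_for_testing(source: TableState, location: str) -> TableState:
    """Hypothetical procedure: reuse the data files under the proposed guardrails."""
    # Guardrail (3): the new table must not share the source table's location.
    if location.rstrip("/") == source.location.rstrip("/"):
        raise ValueError("the new table needs its own location")
    props = dict(source.properties)
    props["snapshot"] = "true"      # guardrail (1)
    props["gc.enabled"] = "false"   # guardrail (2): never delete the shared files
    return TableState(
        location=location,
        schema=source.schema,
        partition_by=source.partition_by,
        properties=props,
        data_files=source.data_files,
    )


source = TableState(
    location="s3://bucket/db/events",
    schema=(("id", "bigint"), ("ts", "timestamp")),
    partition_by=("days(ts)",),
    properties={"write.format.default": "parquet"},
    data_files=("s3://bucket/db/events/data/00000.parquet",),
)
like = create_table_like(source, "s3://bucket/test/events_like")
snap = snapshot_for_testing(source, "s3://bucket/test/events_snap")
```

In this sketch, like starts empty while snap points at the source table's files, which is why only the second one needs the extra guardrails.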

On Thu, Apr 27, 2023 at 11:59 AM Pucheng Yang <py...@pinterest.com.invalid> wrote:

> Hi Anton,
>
> Yes, I want to branch the table state and reuse the data files, but for
> test purposes only. Imagine if we want to test something related to reading
> the Iceberg table or performing a row-level update.
>
> And I acknowledge the potential risk of the table state being corrupted.
> So I am thinking we can consider adding these limitations when running the
> "create table like":
> (1) the created table should have "snapshot=true"
> (2) the created table should have "gc.enabled=false" to make sure existing
> files don't get messed up
> (3) the created table should have a table location different than the
> existing Iceberg table location it is created from
> We can consider "create table like" as a snapshot action for an existing
> Iceberg table, similar to the existing snapshot procedure we have for an
> existing Hive table.
>
> I know CREATE TABLE LIKE is supposed to reuse the existing table
> definition only. If we have concerns around messing up table state, I hope
> we can break down the implementation and at least first implement
> the part where we create tables without reusing the existing data files.
>
> On Wed, Apr 26, 2023 at 8:26 AM Anton Okolnychyi
> <aokolnyc...@apple.com.invalid> wrote:
>
>> Pucheng, you mentioned you want to reuse existing data in the new table?
>> Branching Iceberg table state can lead to unexpected situations as there
>> will be multiple pointers in the catalog to the same state, which can
>> eventually corrupt the table. Isn’t CREATE TABLE LIKE supposed to just
>> reuse the existing table definition without copying the data?
>>
>> - Anton
>>
>> On Apr 26, 2023, at 5:41 AM, Zoltán Borók-Nagy <borokna...@apache.org>
>> wrote:
>>
>> As a reference, Impala can also do Hive-style CREATE TABLE x LIKE y for
>> Iceberg tables.
>> You can see various examples at
>> https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/iceberg-create-table-like-table.test
>>
>> - Zoltan
>>
>> On Wed, Apr 26, 2023 at 4:10 AM Ryan Blue <b...@tabular.io> wrote:
>>
>>> You should be able to see how other DSv2 commands are written and copy
>>> them. Look at Drop Table, maybe, and see if you can copy the structure, but
>>> instead of dropping, load the table and call createTable with its metadata.
>>>
>>> On Tue, Apr 25, 2023 at 4:42 PM Pucheng Yang
>>> <py...@pinterest.com.invalid> wrote:
>>>
>>>> Thanks Steve and Ryan for the reply.
>>>>
>>>> Steve, I am not looking for CTAS; my goal is to create an Iceberg table
>>>> and reuse the existing data (same as the create table like statement
>>>> above). Also, my question is not about specifying the location in the
>>>> create statement.
>>>>
>>>> Ryan, the engine we are interested in is SparkSQL. Since you mentioned
>>>> it is an easy fix, would you please share how that should be implemented
>>>> such that anyone (maybe myself) interested in this can explore the
>>>> solution?
>>>>
>>>> Thanks both again.
>>>>
>>>> On Tue, Apr 25, 2023 at 4:07 PM Ryan Blue <b...@tabular.io> wrote:
>>>>
>>>>> Pucheng, what engine are you interested in?
>>>>>
>>>>> This works fine in Trino: CREATE TABLE table_copy (LIKE source_table
>>>>> INCLUDING PROPERTIES)
>>>>>
>>>>> I don’t know if it works in Hive, and last time I checked it was not
>>>>> implemented for DSv2 in Spark. The Spark problem should be an easy fix.
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Tue, Apr 25, 2023 at 2:43 PM Steve Zhang
>>>>> <hongyue_zh...@apple.com.invalid> wrote:
>>>>>
>>>>>> Hey Pengcheng,
>>>>>>
>>>>>> Are you looking for CTAS as in
>>>>>> https://iceberg.apache.org/docs/latest/spark-ddl/#create-table--as-select?
>>>>>> I think you can also specify an explicit location as part of the create
>>>>>> statement, as in
>>>>>> https://iceberg.apache.org/docs/latest/spark-ddl/#create-table
>>>>>>
>>>>>> Thanks,
>>>>>> Steve Zhang
>>>>>>
>>>>>> On Apr 25, 2023, at 1:46 PM, Pucheng Yang
>>>>>> <py...@pinterest.com.INVALID> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I wonder how folks in the community deal with the cases where you
>>>>>> want to create a test table from an existing Iceberg table? In Hive, what
>>>>>> we normally do is to run a query "create table x like y location z". But
>>>>>> we can't do this for the Iceberg table.
>>>>>>
>>>>>> If this is a feature that is missing, should we collaborate to build
>>>>>> a similar feature?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Tabular
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>

--
Ryan Blue
Tabular