Hi Tianyi,

The behavior you found is indeed the current behavior in Iceberg. I too found it unexpected. I have a PR to address this: https://github.com/apache/iceberg/pull/1508. Due to other work, I had not followed up on this for a while, but I am returning to it now.

- Wing Yew

On Mon, Dec 14, 2020 at 6:27 AM Cap Kurmagati <[email protected]> wrote:

> Hi,
>
> I have a question regarding the behavior of schema evolution with
> time-travel in Iceberg.
> When I run a time-travel query against a table with schema changes,
> I expect the result to be structured using the schema of the queried
> snapshot. But it turned out to be structured using the current schema.
>
> Is this expected behavior?
> I think it would be nice to be able to query the data in its original
> shape. What do you think?
>
> Code snippet as follows. Environment: Iceberg 0.10.0, Spark 3.0.1
>
> sql("create table iceberg.test.schema_timetravel (id int, name string) using iceberg")
> sql("insert into table iceberg.test.schema_timetravel values(1, 'aaa')")
> sql("insert into table iceberg.test.schema_timetravel values(2, 'bbb')")
> sql("select * from iceberg.test.schema_timetravel").show()
> +---+-------+
> | id|   name|
> +---+-------+
> |  1|    aaa|
> |  2|    bbb|
> +---+-------+
> sql("select * from iceberg.test.schema_timetravel.history").show()
> +--------------------+-------------------+-------------------+-------------------+
> |     made_current_at|        snapshot_id|          parent_id|is_current_ancestor|
> +--------------------+-------------------+-------------------+-------------------+
> |2020-12-14 22:44:...|2849000299888498484|               null|               true|
> |2020-12-14 22:44:...|5610242355805640211|2849000299888498484|               true|
> +--------------------+-------------------+-------------------+-------------------+
> sql("alter table iceberg.test.schema_timetravel drop column name")
> sql("select * from iceberg.test.schema_timetravel").show()
> +---+
> | id|
> +---+
> |  1|
> |  2|
> +---+
> spark.read.format("iceberg").option("snapshot-id", 2849000299888498484L).load("test.schema_timetravel").show()
> // Expect: show data in the previous schema: (1, aaa)
> // Result: show data in the current schema: (1)
> +---+
> | id|
> +---+
> |  1|
> +---+
>
> Best regards,
> Tianyi
