Thanks, Jack! I was just thinking about this because I think we want to
have support for time travel with the snapshot schema in 0.13.0 if
possible, so it's a great time to talk about the behavior.

In general, I think that time travel with the snapshot schema is the right
option. The use case I think of is where I am trying to debug what was seen
at some time -- either for a user to understand why a query produced a
certain result, or maybe for an administrator to reproduce a failure case.
Those cases require as close as we can get to exactly what was used.

To get more specific, say I have a table where I replace a column with a
new one using the same name. And then I want to debug a query that happened
at some time. If I use the current schema for that time travel, then I'll
get completely different results -- probably no rows if I'm filtering or
joining by the replaced column. That's really confusing behavior. Maybe
there are some cases where we want to use the current schema, but I think
users implicitly expect time travel to give the same results as a
historical query.

Your example may justify adding an option to time travel with the current
schema, but I'm not sure when you'd have a production app that you use with
time travel. Wouldn't you also roll back to an older version of the query?
I just don't see myself being surprised if an updated query doesn't work.

For the second issue, I agree with you that the current behavior is weird.
But I think the best way to fix it is to simply create a new snapshot at
the same time we update the schema. If we create a snapshot with the same
manifest list, then we can do everything entirely in metadata and avoid the
behavior. We can do this with a quick modification to the TableMetadata
mutation.

If we solve the second case by adding a rewrite snapshot with no
modifications, then I don't think we need to specify a schema. Instead, we
can add a flag for whether to time travel with the current or snapshot
schema. But, I'd wait until we hit a use case that requires this, since I'm
skeptical about people needing it in practice.

Ryan

On Mon, Nov 22, 2021 at 11:52 AM Jack Ye <yezhao...@gmail.com> wrote:

> Hi everyone,
>
> As we are starting to have more use cases around the time travel related
> features in Iceberg, a few questions popped up around the correct schema to
> use for time travel queries.
>
> *Latest table schema vs schema at snapshot time*
> There was a discussion around this in
> https://github.com/apache/iceberg/pull/1508, but I would like to just
> revisit it a bit, because in many applications the schema at read time is
> fixed, and the same schema might be desirable for querying an old snapshot.
> Considering the following case:
>
> t0: create table sample (id bigint, category string, data string)
> t1: insert into sample (1, 'c1', data1'), (2, 'c2', 'data2')
> t2: alter table sample add column ts timestamp
> t3: insert into sample (3, 'c3', 'dat3', timestamp '2000-01-01')
>
> The application that reads the table might already be updated to query
> columns (id, data, ts), and just expect the value of ts to be null for data
> before t2 when it performs a time travel. However, if time travel to t1
> gives data in the old schema, it will not find the ts column and fail the
> query. So things get a lot more complex at application side because it
> needs to distinguish between the query as-of-time and use different logic
> to handle data. This gets very convoluted very quickly as more schema
> evolutions are added.
>
> While I do see the great value of using schema at snapshot time in many
> applications and I am all for this feature, I think for some applications
> that means we are pushing a lot of complexity of handling multi-schema to
> the users. Maybe some kind of option is needed for users to better control
> the behavior to fit their specific use cases.
>
> *Schema for time-based travel*
> As of today, we have 2 time travel syntaxes (use Hive syntax as example):
> - SELECT * FROM table FOR SYSTEM_TIME AS OF timestamp
> - SELECT * FROM table FOR SYSTEM_VERSION AS OF version
>
> For SYSTEM_VERSION based travel, it is pretty clear that we will time
> travel to a specific snapshot based on the snapshot ID, and the schema can
> be resolved by the solution of the last question. However, for SYSTEM_TIME
> based travel, there is ambiguity around the exact schema to use, because
> Iceberg does not create a new snapshot for table metadata operations like
> schema evolution.
>
> Consider the same use case example as above, If the user queries SELECT *
> FROM table FOR SYSTEM_TIME AS OF t2, then it is ambiguous if we should use
> the schema of the snapshot at t1, or the updated schema at t2. Currently
> the behavior is to use a schema of snapshot at t1, which might not be the
> intention of the user.
>
> *Thoughts about the solution*
> Considering the 2 cases above, it seems to me that the most flexible way
> to resolve this is to provide users a way to override the schema to use for
> time travel queries. The default can be kept as using the snapshot schema,
> but another schema could be used if the user specifies the schema ID. This
> provides a lot of flexibility for use cases like table migration and schema
> evolution coordination across microservices.
>
> Any thoughts around this?
>
> Best,
> Jack Ye
>
>
>

-- 
Ryan Blue
Tabular

Reply via email to