Thanks for the reply, Ryan, and thanks for the additional information, Mayur!
Yes, the example you described is one of the biggest reasons for using the
snapshot schema. I only brought up the other example because it was the
conflicting use case I heard: the system would always query a stable snapshot
by referring to a specific snapshot ID, and it could perform a much faster
rollback to an older version of the snapshot, instead of rolling back the
entire app, if something is wrong with the new data.

I agree that adding an option to use the current schema sounds like the right
approach for that specific use case; my proposal was trying to find a fix that
also works for the second issue. But if we are considering committing a
snapshot for a metadata operation, then that would be the best way to solve
the second issue and we are good here.

Best,
Jack Ye

On Mon, Nov 22, 2021 at 3:41 PM Ryan Blue <b...@tabular.io> wrote:

> Thanks, Jack! I was just thinking about this because I think we want to
> have support for time travel with the snapshot schema in 0.13.0 if
> possible, so it's a great time to talk about the behavior.
>
> In general, I think that time travel with the snapshot schema is the
> right option. The use case I think of is where I am trying to debug what
> was seen at some time -- either for a user to understand why a query
> produced a certain result, or maybe for an administrator to reproduce a
> failure case. Those cases require as close as we can get to exactly what
> was used.
>
> To get more specific, say I have a table where I replace a column with a
> new one using the same name, and then I want to debug a query that
> happened at some time. If I use the current schema for that time travel,
> then I'll get completely different results -- probably no rows if I'm
> filtering or joining by the replaced column. That's really confusing
> behavior. Maybe there are some cases where we want to use the current
> schema, but I think users implicitly expect time travel to give the same
> results as a historical query.
> Your example may justify adding an option to time travel with the
> current schema, but I'm not sure when you'd have a production app that
> you use with time travel. Wouldn't you also roll back to an older
> version of the query? I just don't see myself being surprised if an
> updated query doesn't work.
>
> For the second issue, I agree with you that the current behavior is
> weird. But I think the best way to fix it is to simply create a new
> snapshot at the same time we update the schema. If we create a snapshot
> with the same manifest list, then we can do everything entirely in
> metadata and avoid the behavior. We can do this with a quick
> modification to the TableMetadata mutation.
>
> If we solve the second case by adding a rewrite snapshot with no
> modifications, then I don't think we need to specify a schema. Instead,
> we can add a flag for whether to time travel with the current or
> snapshot schema. But I'd wait until we hit a use case that requires
> this, since I'm skeptical about people needing it in practice.
>
> Ryan
>
> On Mon, Nov 22, 2021 at 11:52 AM Jack Ye <yezhao...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> As we are starting to have more use cases around the time travel
>> related features in Iceberg, a few questions popped up around the
>> correct schema to use for time travel queries.
>>
>> *Latest table schema vs schema at snapshot time*
>>
>> There was a discussion around this in
>> https://github.com/apache/iceberg/pull/1508, but I would like to
>> revisit it a bit, because in many applications the schema at read time
>> is fixed, and the same schema might be desirable for querying an old
>> snapshot.
>> Consider the following case:
>>
>> t0: create table sample (id bigint, category string, data string)
>> t1: insert into sample values (1, 'c1', 'data1'), (2, 'c2', 'data2')
>> t2: alter table sample add column ts timestamp
>> t3: insert into sample values (3, 'c3', 'data3', timestamp '2000-01-01')
>>
>> The application that reads the table might already be updated to query
>> the columns (id, data, ts), and simply expect the value of ts to be
>> null for data written before t2 when it performs time travel. However,
>> if time travel to t1 returns data in the old schema, the application
>> will not find the ts column and the query will fail. Things then get a
>> lot more complex on the application side, because it needs to
>> distinguish between query as-of times and use different logic to handle
>> each schema. This gets very convoluted very quickly as more schema
>> evolutions are added.
>>
>> While I do see the great value of using the schema at snapshot time in
>> many applications, and I am all for this feature, I think for some
>> applications it means we are pushing a lot of the complexity of
>> handling multiple schemas onto the users. Maybe some kind of option is
>> needed for users to better control the behavior to fit their specific
>> use cases.
>>
>> *Schema for time-based travel*
>>
>> As of today, we have two time travel syntaxes (using Hive syntax as an
>> example):
>>
>> - SELECT * FROM table FOR SYSTEM_TIME AS OF timestamp
>> - SELECT * FROM table FOR SYSTEM_VERSION AS OF version
>>
>> For SYSTEM_VERSION-based travel, it is pretty clear that we will time
>> travel to a specific snapshot based on the snapshot ID, and the schema
>> can be resolved by the answer to the previous question. However, for
>> SYSTEM_TIME-based travel, there is ambiguity around the exact schema to
>> use, because Iceberg does not create a new snapshot for table metadata
>> operations like schema evolution.
>> Consider the same example as above. If the user queries SELECT * FROM
>> table FOR SYSTEM_TIME AS OF t2, it is ambiguous whether we should use
>> the schema of the snapshot at t1 or the updated schema at t2. Currently
>> the behavior is to use the schema of the snapshot at t1, which might
>> not be the intention of the user.
>>
>> *Thoughts about the solution*
>>
>> Considering the two cases above, it seems to me that the most flexible
>> way to resolve this is to provide users with a way to override the
>> schema used for time travel queries. The default can be kept as the
>> snapshot schema, but another schema could be used if the user specifies
>> its schema ID. This provides a lot of flexibility for use cases like
>> table migration and coordinating schema evolution across microservices.
>>
>> Any thoughts on this?
>>
>> Best,
>> Jack Ye
>
> --
> Ryan Blue
> Tabular
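[Editor's note: to make the two read modes debated in this thread concrete, here is a minimal Python sketch of Jack's timeline. It is an illustrative model only, not Iceberg code; the `snapshots` structure and `read` function are hypothetical names invented for this example. It contrasts snapshot-schema time travel (the proposed default) with current-schema time travel (the option Jack's use case wants).]

```python
# Illustrative model of Jack's timeline (not Iceberg code).
# Each snapshot records the schema that was current when it was committed.
snapshots = {
    "t1": {
        "schema": ["id", "category", "data"],
        "rows": [(1, "c1", "data1"), (2, "c2", "data2")],
    },
    # t2 (add column ts) creates no snapshot; t3 writes with the new schema.
    "t3": {
        "schema": ["id", "category", "data", "ts"],
        "rows": [(3, "c3", "data3", "2000-01-01")],
    },
}
current_schema = ["id", "category", "data", "ts"]

def read(snapshot_id, use_current_schema=False):
    """Project a snapshot's rows onto the chosen schema, null-filling
    columns that the snapshot's data does not contain."""
    snap = snapshots[snapshot_id]
    stored = snap["schema"]
    schema = current_schema if use_current_schema else stored
    rows = [
        tuple(row[stored.index(col)] if col in stored else None
              for col in schema)
        for row in snap["rows"]
    ]
    return schema, rows

# Snapshot-schema time travel to t1: there is no ts column at all, so an
# application that selects ts fails.
schema, _ = read("t1")
assert "ts" not in schema

# Current-schema time travel to t1: ts is present and null-filled, which
# is what the already-updated application in Jack's example expects.
schema, rows = read("t1", use_current_schema=True)
assert rows == [(1, "c1", "data1", None), (2, "c2", "data2", None)]
```

Ryan's column-replacement counterexample is the mirror image: when a column has been dropped and re-added under the same name, projecting old data onto the current schema null-fills the replaced column and silently changes historical results, which is why the snapshot schema is the safer default and the current schema belongs behind an explicit option.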