Thanks for the reply, Ryan, and thanks for the additional information, Mayur!
Yes, the example you described is one of the biggest reasons for using the
snapshot schema. I only brought up the other example because it was the
conflicting use case I heard: the system would always query a stable snapshot
by referring to a specific snapshot ID, and it could perform a much faster
rollback to an older version of the snapshot, instead of rolling back the
entire app, if something is wrong with the new data.

I agree that adding an option to use the current schema sounds like the right
approach for that specific use case; my proposal was trying to find a fix that
also works for the second issue. But if we are considering committing a
snapshot for a metadata operation, then that would be the best way to solve
the second issue and we are good here.

Best,
Jack Ye

On Mon, Nov 22, 2021 at 3:41 PM Ryan Blue <b...@tabular.io> wrote:

> Thanks, Jack! I was just thinking about this because I think we want to
> have support for time travel with the snapshot schema in 0.13.0 if
> possible, so it's a great time to talk about the behavior.
>
> In general, I think that time travel with the snapshot schema is the
> right option. The use case I think of is where I am trying to debug what
> was seen at some time -- either for a user to understand why a query
> produced a certain result, or maybe for an administrator to reproduce a
> failure case. Those cases require as close as we can get to exactly what
> was used.
>
> To get more specific, say I have a table where I replace a column with a
> new one using the same name, and then I want to debug a query that
> happened at some time. If I use the current schema for that time travel,
> then I'll get completely different results -- probably no rows if I'm
> filtering or joining by the replaced column. That's really confusing
> behavior. Maybe there are some cases where we want to use the current
> schema, but I think users implicitly expect time travel to give the same
> results as a historical query.
> Your example may justify adding an option to time travel with the
> current schema, but I'm not sure when you'd have a production app that
> you use with time travel. Wouldn't you also roll back to an older
> version of the query? I just don't see myself being surprised if an
> updated query doesn't work.
>
> For the second issue, I agree with you that the current behavior is
> weird. But I think the best way to fix it is to simply create a new
> snapshot at the same time we update the schema. If we create a snapshot
> with the same manifest list, then we can do everything entirely in
> metadata and avoid the behavior. We can do this with a quick
> modification to the TableMetadata mutation.
>
> If we solve the second case by adding a rewrite snapshot with no
> modifications, then I don't think we need to specify a schema. Instead,
> we can add a flag for whether to time travel with the current or
> snapshot schema. But I'd wait until we hit a use case that requires
> this, since I'm skeptical about people needing it in practice.
>
> Ryan
>
> On Mon, Nov 22, 2021 at 11:52 AM Jack Ye <yezhao...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> As we are starting to have more use cases around the time travel
>> related features in Iceberg, a few questions popped up around the
>> correct schema to use for time travel queries.
>>
>> *Latest table schema vs schema at snapshot time*
>>
>> There was a discussion around this in
>> https://github.com/apache/iceberg/pull/1508, but I would like to
>> revisit it a bit, because in many applications the schema at read time
>> is fixed, and the same schema might be desirable for querying an old
>> snapshot.
>> Consider the following case:
>>
>> t0: create table sample (id bigint, category string, data string)
>> t1: insert into sample values (1, 'c1', 'data1'), (2, 'c2', 'data2')
>> t2: alter table sample add column ts timestamp
>> t3: insert into sample values (3, 'c3', 'data3', timestamp '2000-01-01')
>>
>> The application that reads the table might already be updated to query
>> the columns (id, data, ts), and simply expect the value of ts to be
>> null for data written before t2 when it performs time travel. However,
>> if time travel to t1 returns data in the old schema, the application
>> will not find the ts column and the query will fail. Things then get a
>> lot more complex on the application side, because it needs to
>> distinguish between query as-of times and use different logic to handle
>> each schema. This gets very convoluted very quickly as more schema
>> evolutions are added.
>>
>> While I do see the great value of using the schema at snapshot time in
>> many applications, and I am all for this feature, I think for some
>> applications it means we are pushing a lot of the complexity of
>> handling multiple schemas onto the users. Maybe some kind of option is
>> needed for users to better control the behavior to fit their specific
>> use cases.
>>
>> *Schema for time-based travel*
>>
>> As of today, we have two time travel syntaxes (using Hive syntax as an
>> example):
>>
>> - SELECT * FROM table FOR SYSTEM_TIME AS OF timestamp
>> - SELECT * FROM table FOR SYSTEM_VERSION AS OF version
>>
>> For SYSTEM_VERSION-based travel, it is pretty clear that we will time
>> travel to a specific snapshot based on the snapshot ID, and the schema
>> can be resolved by the answer to the previous question. However, for
>> SYSTEM_TIME-based travel, there is ambiguity around the exact schema to
>> use, because Iceberg does not create a new snapshot for table metadata
>> operations like schema evolution.
>> Consider the same example as above. If the user queries SELECT * FROM
>> table FOR SYSTEM_TIME AS OF t2, it is ambiguous whether we should use
>> the schema of the snapshot at t1 or the updated schema at t2. Currently
>> the behavior is to use the schema of the snapshot at t1, which might
>> not be the intention of the user.
>>
>> *Thoughts about the solution*
>>
>> Considering the two cases above, it seems to me that the most flexible
>> way to resolve this is to provide users with a way to override the
>> schema used for time travel queries. The default can be kept as the
>> snapshot schema, but another schema could be used if the user specifies
>> its schema ID. This provides a lot of flexibility for use cases like
>> table migration and coordinating schema evolution across microservices.
>>
>> Any thoughts on this?
>>
>> Best,
>> Jack Ye
>
> --
> Ryan Blue
> Tabular
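[Editor's note: to make the two read modes debated in this thread concrete, here is a minimal Python sketch of Jack's timeline. It is an illustrative model only, not Iceberg code; the `snapshots` structure and `read` function are hypothetical names invented for this example. It contrasts snapshot-schema time travel (the proposed default) with current-schema time travel (the option Jack's use case wants).]

```python
# Illustrative model of Jack's timeline (not Iceberg code).
# Each snapshot records the schema that was current when it was committed.
snapshots = {
    "t1": {
        "schema": ["id", "category", "data"],
        "rows": [(1, "c1", "data1"), (2, "c2", "data2")],
    },
    # t2 (add column ts) creates no snapshot; t3 writes with the new schema.
    "t3": {
        "schema": ["id", "category", "data", "ts"],
        "rows": [(3, "c3", "data3", "2000-01-01")],
    },
}
current_schema = ["id", "category", "data", "ts"]

def read(snapshot_id, use_current_schema=False):
    """Project a snapshot's rows onto the chosen schema, null-filling
    columns that the snapshot's data does not contain."""
    snap = snapshots[snapshot_id]
    stored = snap["schema"]
    schema = current_schema if use_current_schema else stored
    rows = [
        tuple(row[stored.index(col)] if col in stored else None
              for col in schema)
        for row in snap["rows"]
    ]
    return schema, rows

# Snapshot-schema time travel to t1: there is no ts column at all, so an
# application that selects ts fails.
schema, _ = read("t1")
assert "ts" not in schema

# Current-schema time travel to t1: ts is present and null-filled, which
# is what the already-updated application in Jack's example expects.
schema, rows = read("t1", use_current_schema=True)
assert rows == [(1, "c1", "data1", None), (2, "c2", "data2", None)]
```

Ryan's column-replacement counterexample is the mirror image: when a column has been dropped and re-added under the same name, projecting old data onto the current schema null-fills the replaced column and silently changes historical results, which is why the snapshot schema is the safer default and the current schema belongs behind an explicit option.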