Hi Iceberg devs,

We have been considering the problem of Time-sliced incremental scan (i.e.,
reading data that is committed between two timestamps), and I ran into this
thread [1] in the Iceberg dev mailing list. The summary of the thread is
that incremental scan should leverage snapshot IDs as opposed to timestamps
since there is no guarantee that commit timestamps are linear.

I wanted to follow up on that discussion to see if folks are open to still
supporting time-sliced incremental scan APIs with the caveat above. The
reasons are many fold:

* Time-slice APIs are more human friendly and simplify the need for a state
store to track last read snapshot IDs. While some sort of state recording
may be required in some cases, it is not always the case, especially if
data consumption happens at regular intervals.

* I understand the caveat discussed in the thread still applies to the
existing "TIMESTAMP AS OF" API, yet the API is supported. By extension, I
think it is fair to extend the support to time-sliced incremental scan.

* Iceberg already provides a deterministic function to translate a
timestamp to a snapshot ID [2]. Since a table can be scanned by snapshot
range, I think it is reasonable to allow scanning by timestamp range, since
the translation mechanism already exists.

* Incremental scan using timestamp range is part of the SQL Standard (e.g.
"VERSIONS BETWEEN SYSTEM TIME ... AND ...") and is supported by some
existing engines. See [3] for SQL Server support.

* Conceptually, it is possible to implement the same query semantics at the
SQL level using existing APIs and operators such as "TIMESTAMP AS OF" and
"EXCEPT" (e.g., by selecting: (T as of timestamp1) EXCEPT (T as of
timestamp2)). It sounds that incremental scan is an optimization to push
the differencing operation to the data source as opposed to letting the
engine deal with it, so it is better to implement the SQL shorthand, and
its corresponding data source optimization.

Therefore, I would like to ask if we can proceed with supporting an API
along the lines of "VERSIONS BETWEEN SYSTEM TIME ... AND ...", and consider
the discussion in [1] as a caveat that folks using this API (or existing
timestamp APIs) have to keep in mind, until the implementation guarantees
linearity at some point in the future.

Thanks,
Walaa.

[1] https://www.mail-archive.com/dev@iceberg.apache.org/msg01504.html
[2]
https://github.com/apache/iceberg/blob/1a11038baaca9f4cf03674654a6d9a85f39a1ce1/core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java#L265
[3]
https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver15

Reply via email to