[ 
https://issues.apache.org/jira/browse/HIVE-26151?focusedWorklogId=759066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-759066
 ]

ASF GitHub Bot logged work on HIVE-26151:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Apr/22 09:26
            Start Date: 20/Apr/22 09:26
    Worklog Time Spent: 10m 
      Work Description: marton-bod commented on code in PR #3222:
URL: https://github.com/apache/hive/pull/3222#discussion_r853924488


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java:
##########
@@ -207,6 +218,39 @@ public RecordReader<Void, T> createRecordReader(InputSplit split, TaskAttemptCon
     return new IcebergRecordReader<>();
   }
 
+  private static TableScan scanWithTimeRange(Table table, Configuration conf, TableScan scan, long fromTime) {
+    // let's find the corresponding snapshot ID - if the fromTime is before the table creation happened, let's use
+    // the first snapshot of the table
+    long fromSnapshot = IcebergTableUtil.findSnapshotForTimestamp(table, fromTime)
+        .orElseGet(() -> table.history().get(0).snapshotId());
+    if (fromSnapshot == table.currentSnapshot().snapshotId()) {
+      throw new IllegalArgumentException(
+          "Provided FROM timestamp must be earlier than the latest snapshot of the table.");
+    }
+    long toTime = conf.getLong(InputFormatConfig.TO_TIMESTAMP, -1);

Review Comment:
   Sure



##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java:
##########
@@ -207,6 +218,39 @@ public RecordReader<Void, T> createRecordReader(InputSplit split, TaskAttemptCon
     return new IcebergRecordReader<>();
   }
 
+  private static TableScan scanWithTimeRange(Table table, Configuration conf, TableScan scan, long fromTime) {
+    // let's find the corresponding snapshot ID - if the fromTime is before the table creation happened, let's use
+    // the first snapshot of the table
+    long fromSnapshot = IcebergTableUtil.findSnapshotForTimestamp(table, fromTime)
+        .orElseGet(() -> table.history().get(0).snapshotId());
+    if (fromSnapshot == table.currentSnapshot().snapshotId()) {
+      throw new IllegalArgumentException(
+          "Provided FROM timestamp must be earlier than the latest snapshot of the table.");
+    }
+    long toTime = conf.getLong(InputFormatConfig.TO_TIMESTAMP, -1);
+    if (toTime != -1) {
+      if (fromTime >= toTime) {

Review Comment:
   Yep, makes sense





Issue Time Tracking
-------------------

    Worklog Id:     (was: 759066)
    Time Spent: 40m  (was: 0.5h)

> Support range-based time travel queries for Iceberg
> ---------------------------------------------------
>
>                 Key: HIVE-26151
>                 URL: https://issues.apache.org/jira/browse/HIVE-26151
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Allow querying which records have been inserted during a certain time window 
> for Iceberg tables. The Iceberg TableScan API provides an implementation for 
> that, so most of the work would go into adding syntax support and 
> transporting the startTime and endTime parameters to the Iceberg input format.
> Proposed new syntax: 
> SELECT * FROM table FOR SYSTEM_TIME FROM '<startTime>' TO '<endTime>'
> SELECT * FROM table FOR SYSTEM_VERSION FROM <startVersion> TO <endVersion>
> (the TO clause is optional in both cases)
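
For readers following the review, here is a rough, self-contained sketch of how a time window might be resolved to a pair of snapshot IDs, mirroring the checks visible in the diff above. It is not the Hive or Iceberg code: the class TimeRangeSketch, the Entry type, and resolveRange are hypothetical stand-ins, the "latest snapshot at or before the timestamp" lookup rule is an assumption about what IcebergTableUtil.findSnapshotForTimestamp does, and the handling of the TO side (including its error messages) is invented for illustration. Only the FROM fallback, the FROM error message, the -1 default and the fromTime >= toTime check come from the diff.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model of the range resolution; not Hive or Iceberg code.
class TimeRangeSketch {

  // Hypothetical history entry: commit time in millis plus snapshot ID.
  static final class Entry {
    final long timestampMillis;
    final long snapshotId;

    Entry(long timestampMillis, long snapshotId) {
      this.timestampMillis = timestampMillis;
      this.snapshotId = snapshotId;
    }
  }

  // Assumed lookup rule: the latest snapshot committed at or before the timestamp.
  // The history list is assumed to be sorted oldest first.
  static Optional<Long> findSnapshotForTimestamp(List<Entry> history, long timestampMillis) {
    Optional<Long> result = Optional.empty();
    for (Entry e : history) {
      if (e.timestampMillis <= timestampMillis) {
        result = Optional.of(e.snapshotId);
      }
    }
    return result;
  }

  // Resolves a time window to {fromSnapshotId, toSnapshotId}.
  // toTime == -1 means the TO clause was omitted, as in the diff's conf.getLong default.
  static long[] resolveRange(List<Entry> history, long fromTime, long toTime) {
    if (history.isEmpty()) {
      throw new IllegalArgumentException("Table has no snapshots.");
    }
    long current = history.get(history.size() - 1).snapshotId;
    // As in the diff: fall back to the first snapshot if fromTime predates the table.
    long from = findSnapshotForTimestamp(history, fromTime)
        .orElseGet(() -> history.get(0).snapshotId);
    if (from == current) {
      throw new IllegalArgumentException(
          "Provided FROM timestamp must be earlier than the latest snapshot of the table.");
    }
    if (toTime == -1) {
      return new long[] {from, current};
    }
    if (fromTime >= toTime) {
      // Message is illustrative; the PR's actual wording is not shown in this mail.
      throw new IllegalArgumentException("FROM timestamp must be earlier than TO timestamp.");
    }
    // Assumption: a TO timestamp before the first snapshot is rejected.
    long to = findSnapshotForTimestamp(history, toTime)
        .orElseThrow(() -> new IllegalArgumentException("TO timestamp precedes the first snapshot."));
    return new long[] {from, to};
  }

  public static void main(String[] args) {
    List<Entry> history = List.of(new Entry(100L, 1L), new Entry(200L, 2L), new Entry(300L, 3L));
    long[] range = resolveRange(history, 150L, 250L);
    System.out.println(range[0] + " -> " + range[1]);
  }
}
```

In the real input format the resolved pair would presumably be handed to the Iceberg TableScan API mentioned in the description; that step is left out here because it needs the Iceberg library on the classpath.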



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
