[ https://issues.apache.org/jira/browse/HIVE-26151?focusedWorklogId=759066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-759066 ]

ASF GitHub Bot logged work on HIVE-26151:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Apr/22 09:26
            Start Date: 20/Apr/22 09:26
    Worklog Time Spent: 10m
      Work Description: marton-bod commented on code in PR #3222:
URL: https://github.com/apache/hive/pull/3222#discussion_r853924488


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java:
##########
@@ -207,6 +218,39 @@ public RecordReader<Void, T> createRecordReader(InputSplit split, TaskAttemptCon
     return new IcebergRecordReader<>();
   }
+  private static TableScan scanWithTimeRange(Table table, Configuration conf, TableScan scan, long fromTime) {
+    // let's find the corresponding snapshot ID - if the fromTime is before the table creation happened, let's use
+    // the first snapshot of the table
+    long fromSnapshot = IcebergTableUtil.findSnapshotForTimestamp(table, fromTime)
+        .orElseGet(() -> table.history().get(0).snapshotId());
+    if (fromSnapshot == table.currentSnapshot().snapshotId()) {
+      throw new IllegalArgumentException(
+          "Provided FROM timestamp must be earlier than the latest snapshot of the table.");
+    }
+    long toTime = conf.getLong(InputFormatConfig.TO_TIMESTAMP, -1);

Review Comment:
   Sure


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java:
##########
@@ -207,6 +218,39 @@ public RecordReader<Void, T> createRecordReader(InputSplit split, TaskAttemptCon
     return new IcebergRecordReader<>();
   }
+  private static TableScan scanWithTimeRange(Table table, Configuration conf, TableScan scan, long fromTime) {
+    // let's find the corresponding snapshot ID - if the fromTime is before the table creation happened, let's use
+    // the first snapshot of the table
+    long fromSnapshot = IcebergTableUtil.findSnapshotForTimestamp(table, fromTime)
+        .orElseGet(() -> table.history().get(0).snapshotId());
+    if (fromSnapshot == table.currentSnapshot().snapshotId()) {
+      throw new IllegalArgumentException(
+          "Provided FROM timestamp must be earlier than the latest snapshot of the table.");
+    }
+    long toTime = conf.getLong(InputFormatConfig.TO_TIMESTAMP, -1);
+    if (toTime != -1) {
+      if (fromTime >= toTime) {

Review Comment:
   Yep, makes sense


Issue Time Tracking
-------------------

    Worklog Id:     (was: 759066)
    Time Spent: 40m  (was: 0.5h)

> Support range-based time travel queries for Iceberg
> ---------------------------------------------------
>
>                 Key: HIVE-26151
>                 URL: https://issues.apache.org/jira/browse/HIVE-26151
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Allow querying which records have been inserted during a certain time window
> for Iceberg tables. The Iceberg TableScan API provides an implementation for
> that, so most of the work would go into adding syntax support and
> transporting the startTime and endTime parameters to the Iceberg input format.
> Proposed new syntax:
> SELECT * FROM table FOR SYSTEM_TIME FROM '<startTime>' TO '<endTime>'
> SELECT * FROM table FOR SYSTEM_VERSION FROM <startVersion> TO <endVersion>
> (the TO clause is optional in both cases)



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
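
For readers following the review, the timestamp-to-snapshot resolution discussed in the quoted hunk can be sketched without any Iceberg dependency. This is only an illustration under stated assumptions, not the code from the PR: SnapshotEntry and TimeRangeSketch are hypothetical stand-ins for the table history and for IcebergTableUtil, the history is assumed to be ordered oldest first, and the handling of the TO timestamp after the fromTime >= toTime check is a guess, since the quoted diff is cut off at that point.

```java
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical stand-in for one entry of a table's snapshot history (not the Iceberg API). */
final class SnapshotEntry {
  final long snapshotId;
  final long timestampMillis;

  SnapshotEntry(long snapshotId, long timestampMillis) {
    this.snapshotId = snapshotId;
    this.timestampMillis = timestampMillis;
  }
}

/** Hypothetical helper that mirrors the checks visible in the quoted hunk. */
final class TimeRangeSketch {
  /** Same "not set" marker as the -1 default in the quoted conf.getLong call. */
  static final long UNSET = -1L;

  /** Latest snapshot committed at or before the timestamp; history must be ordered oldest first. */
  static OptionalLong findSnapshotForTimestamp(List<SnapshotEntry> history, long timestampMillis) {
    OptionalLong result = OptionalLong.empty();
    for (SnapshotEntry entry : history) {
      if (entry.timestampMillis > timestampMillis) {
        break;
      }
      result = OptionalLong.of(entry.snapshotId);
    }
    return result;
  }

  /** Returns {fromSnapshotId, toSnapshotId} for the requested time window. */
  static long[] resolveRange(List<SnapshotEntry> history, long fromTime, long toTime) {
    if (history.isEmpty()) {
      throw new IllegalArgumentException("Table has no snapshots.");
    }
    long first = history.get(0).snapshotId;
    long current = history.get(history.size() - 1).snapshotId;

    // As in the quoted hunk: fall back to the first snapshot if fromTime predates the table.
    long fromSnapshot = findSnapshotForTimestamp(history, fromTime).orElse(first);
    if (fromSnapshot == current) {
      throw new IllegalArgumentException(
          "Provided FROM timestamp must be earlier than the latest snapshot of the table.");
    }
    if (toTime == UNSET) {
      // The TO clause is optional: read up to the current snapshot.
      return new long[] {fromSnapshot, current};
    }
    if (fromTime >= toTime) {
      throw new IllegalArgumentException("Provided FROM timestamp must precede the TO timestamp.");
    }
    // Assumption: a TO timestamp before the first snapshot is rejected.
    long toSnapshot = findSnapshotForTimestamp(history, toTime)
        .orElseThrow(() -> new IllegalArgumentException("Provided TO timestamp predates the table."));
    return new long[] {fromSnapshot, toSnapshot};
  }
}
```

With a history of snapshots 1, 2 and 3 committed at 100, 200 and 300 ms, resolveRange(history, 150, TimeRangeSketch.UNSET) resolves to snapshots 1 and 3, while a FROM timestamp of 300 is rejected because it already points at the latest snapshot.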