[ https://issues.apache.org/jira/browse/HIVE-26498?focusedWorklogId=807298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-807298 ]
ASF GitHub Bot logged work on HIVE-26498:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Sep/22 07:58
            Start Date: 09/Sep/22 07:58
    Worklog Time Spent: 10m
      Work Description: lcspinter commented on code in PR #3552:
URL: https://github.com/apache/hive/pull/3552#discussion_r966745463


##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveMaterializedViewUtils.java:
##########
@@ -160,11 +181,79 @@ public static Boolean isOutdatedMaterializedView(
     return false;
   }
 
+  private static Boolean isOutdatedMaterializedView(
+      MaterializationSnapshot snapshot, Hive db,
+      Set<TableName> tablesUsed, Table materializedViewTable) throws HiveException {
+    List<String> tablesUsedNames = tablesUsed.stream()
+        .map(tableName -> TableName.getDbTable(tableName.getDb(), tableName.getTable()))
+        .collect(Collectors.toList());
+
+    Map<String, String> snapshotMap = snapshot.getTableSnapshots();
+    if (snapshotMap == null || snapshotMap.isEmpty()) {
+      LOG.debug("Materialized view " + materializedViewTable.getFullyQualifiedName() +
+          " ignored for rewriting as we could not obtain current snapshot ids");
+      return null;
+    }
+
+    Set<String> storedTablesUsed = materializedViewTable.getMVMetadata().getSourceTableFullNames();
+    for (String fullyQualifiedTableName : tablesUsedNames) {
+      // Note. If the materialized view does not contain a table that is contained in the query,
+      // we do not need to check whether that specific table is outdated or not. If a rewriting
+      // is produced in those cases, it is because that additional table is joined with the
+      // existing tables with an append-columns only join, i.e., PK-FK + not null.
+      if (!storedTablesUsed.contains(fullyQualifiedTableName)) {
+        continue;
+      }
+
+      Table table = db.getTable(fullyQualifiedTableName);
+      if (table.getStorageHandler() == null) {
+        LOG.debug("Materialized view {} ignored for rewriting as we could not storage handler of table {}",
+            materializedViewTable.getFullyQualifiedName(), fullyQualifiedTableName);
+        return null;
+      }
+      String currentTableSnapshot = table.getStorageHandler().getCurrentSnapshotId(table);
+      if (isBlank(currentTableSnapshot)) {

Review Comment:
   According to the API contract (`@return String representation of the snapshotId or null if not supported.`), -1 is not a valid snapshot id. We should either communicate this or, better, stay in sync with the Iceberg API to prevent confusion: if the table is empty, return a null snapshot.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 807298)
    Time Spent: 4.5h  (was: 4h 20m)

> Implement MV maintenance with Iceberg sources using full rebuild
> ----------------------------------------------------------------
>
>                 Key: HIVE-26498
>                 URL: https://issues.apache.org/jira/browse/HIVE-26498
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Materialized views
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create external table tbl_ice(a int, b string, c int) stored by iceberg
> stored as orc tblproperties ('format-version'='2');
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52),
> (4, 'four', 53), (5, 'five', 54);
> create materialized view mat1 as
> select b, c from tbl_ice where c > 52;
> insert into tbl_ice values (111, 'one', 55), (333, 'two', 56);
> explain cbo
> alter materialized view mat1 rebuild;
> alter materialized view mat1 rebuild;
> {code}
> MV full rebuild plan
> {code}
> CBO PLAN:
> HiveProject(b=[$1], c=[$2])
>   HiveFilter(condition=[>($2, 52)])
>     HiveTableScan(table=[[default, tbl_ice]], table:alias=[tbl_ice])
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
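
For readers following the review comment, here is a minimal, self-contained sketch of the tri-state check it discusses: `null` means "cannot tell", `TRUE` means outdated, `FALSE` means up to date. It is not the code from PR #3552. The class `SnapshotCheck`, the method `isOutdated`, and the plain `Map<String, String>` inputs are hypothetical stand-ins, and both the string-equality comparison and the choice to return `null` on a blank current snapshot id are assumptions, since the quoted diff is cut off at the `isBlank` check.

```java
import java.util.Map;

// Hypothetical illustration only; names and comparison logic are assumptions,
// not the Hive implementation.
final class SnapshotCheck {

  /**
   * @param stored  snapshot ids recorded when the materialized view was built,
   *                keyed by fully qualified table name
   * @param current snapshot ids reported now; a missing or blank value models
   *                "null if not supported" from the quoted API contract
   * @return TRUE if any source table changed, FALSE if none did,
   *         null if the state cannot be determined
   */
  static Boolean isOutdated(Map<String, String> stored, Map<String, String> current) {
    if (stored == null || stored.isEmpty()) {
      return null; // no recorded snapshots to compare against
    }
    for (Map.Entry<String, String> entry : stored.entrySet()) {
      String currentId = current.get(entry.getKey());
      if (currentId == null || currentId.trim().isEmpty()) {
        return null; // assumed handling: unknown snapshot id means "cannot tell"
      }
      if (!currentId.equals(entry.getValue())) {
        return Boolean.TRUE; // source table moved to a different snapshot
      }
    }
    return Boolean.FALSE;
  }
}
```

Under these assumptions a sentinel such as "-1" would be compared like any real id, which is one way to read the reviewer's preference for returning null when no snapshot exists.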