[ https://issues.apache.org/jira/browse/IMPALA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080046#comment-18080046 ]
ASF subversion and git services commented on IMPALA-14801:
----------------------------------------------------------
Commit fc78ecb81d496581380d10ae749909d53c073dca in impala's branch
refs/heads/master from Mihaly Szjatinya
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fc78ecb81 ]
IMPALA-14801: Catalog topic update creation can't skip Iceberg tables
This patch extends the skipping mechanism from IMPALA-6671 to Iceberg
tables. It uses the Iceberg table's own tableLock and the
lastVersionSeenByTopicUpdate mechanism of the underlying HDFS table.
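The skipping idea can be illustrated with a minimal, language-neutral
sketch. It is not the Impala code (which is Java): Table, collect_table,
timeout_s and max_skipped are hypothetical names, a plain threading.Lock
stands in for the table lock, and falling back to a blocking acquire once
the skip limit is reached is an assumption about how
catalog_max_lock_skipped_topic_updates behaves.

```python
import threading


class Table:
    """Hypothetical stand-in for a catalog table guarded by a lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.skipped_updates = 0  # consecutive topic updates that skipped this table


def collect_table(table, timeout_s, max_skipped):
    """Try to add one table to a topic update.

    Returns "added", "skipped", or "added_after_blocking".
    """
    if table.skipped_updates >= max_skipped:
        # Assumed behavior: the table was skipped too often, so wait for the lock.
        table.lock.acquire()
        outcome = "added_after_blocking"
    elif table.lock.acquire(timeout=timeout_s):
        outcome = "added"
    else:
        # Lock is busy (e.g. a long table load); leave the table for a later update.
        table.skipped_updates += 1
        return "skipped"
    try:
        # The table made it into this update, so the skip counter starts over.
        table.skipped_updates = 0
        return outcome
    finally:
        table.lock.release()
```

With the lock held elsewhere, the first max_skipped calls return "skipped"
quickly and only the next call waits for the lock, so a single slow table
does not stall every update.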
Testing:
Added TestTopicUpdateFrequency::test_topic_updates_*_iceberg in 4
variants
Fixed the original HDFS tests:
1. test_topic_updates_unblock
  - By default it was running in Local Catalog mode, which has no effect
    on fast DML/DQL queries. Added variants for both Legacy and Local
    Catalog mode to demonstrate the case.
  - In the blocking scenario an assert was missing (!), so the test
    always passed.
  - Fast queries run much faster than the test states, though this does
    not seem to matter for the nature of the test.
  - Reduced query delay times.
2. test_topic_updates_advance
  - The test claimed to test catalog_max_lock_skipped_topic_updates, but
    experimentally I could see no counter-triggered blockings at all
    under any configuration.
  - Added more parallel threads and reduced
    catalog_max_lock_skipped_topic_updates to 2 to reliably trigger the
    blockings. Ran 20 times locally to verify.
  - The query execution time expectations were largely incorrect,
    perhaps because execution times changed over time.
  - The assert expects the max query time to be no more than a
    predictable value, which conceptually makes sense for SYNC_DDL, but
    I wasn't able to reliably reproduce the case yet.
  - Hence, for now, the test at least checks that the blockings occur in
    the catalog logs.
  - Reduced query delay times.
3. Removed test_topic_lock_timeout_disabled; it is now covered by one of
   the test_topic_updates_unblock variants.
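The shape of such a timing check can be sketched as follows. This is only
an illustration, not the code in test_topic_update_frequency.py:
measure_parallel and check_unblocked are hypothetical helpers, and
time.sleep stands in for real queries with injected delays.

```python
import threading
import time


def run_timed(fn, results, key):
    """Run fn and record its wall-clock duration in seconds under key."""
    start = time.monotonic()
    fn()
    results[key] = time.monotonic() - start


def measure_parallel(queries):
    """Run every callable in queries (name -> callable) on its own thread.

    Returns a dict mapping each name to its duration in seconds.
    """
    results = {}
    threads = [
        threading.Thread(target=run_timed, args=(fn, results, name))
        for name, fn in queries.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def check_unblocked(durations, fast_names, limit_s):
    """Fail if any of the fast queries took limit_s seconds or longer."""
    slowest = max(durations[name] for name in fast_names)
    # The comparison must be asserted explicitly; without the assert the
    # check would have no effect and the test could never fail.
    assert slowest < limit_s, "fast query took %.3fs, limit %.3fs" % (slowest, limit_s)
    return slowest
```

A caller would run one slow and several fast queries through
measure_parallel and pass only the fast ones to check_unblocked, with a
limit well below the slow query's injected delay.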
Change-Id: I51e46820aaa096f3eb69f4dcf580e49a69d6603d
Reviewed-on: http://gerrit.cloudera.org:8080/24243
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Csaba Ringhofer <[email protected]>
> Catalog topic update creation can't skip Iceberg tables
> -------------------------------------------------------
>
> Key: IMPALA-14801
> URL: https://issues.apache.org/jira/browse/IMPALA-14801
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Csaba Ringhofer
> Assignee: Mihaly Szjatinya
> Priority: Critical
>
> https://github.com/apache/impala/blob/31769a7fb50ae1d6b6d69d366a776df441e00e3a/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1819
> lockTableAndAddToCatalogDelta() calls lockHdfsTblWithTimeout() only for
> HdfsTable, which means that for IcebergTable tbl.takeReadLock() is
> called, which blocks the catalog topic update collection until it can take
> the table lock. This can be a serious issue, as loading Iceberg tables can
> take a significant amount of time.
> The skipping logic was implemented in IMPALA-6671 for Hive tables:
> https://github.com/apache/impala/commit/2fccd82590d747d834b8be6f3b05bb446d9bac12
> Test for existing skipping logic:
> https://github.com/apache/impala/blob/31769a7fb50ae1d6b6d69d366a776df441e00e3a/tests/custom_cluster/test_topic_update_frequency.py#L30
> It uses several debug actions to inject delays to reproduce the blocking
> issue. Testing Iceberg tables should be possible in a similar way.