[ 
https://issues.apache.org/jira/browse/IMPALA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-14801:
-------------------------------------
    Description: 
https://github.com/apache/impala/blob/31769a7fb50ae1d6b6d69d366a776df441e00e3a/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1819

lockTableAndAddToCatalogDelta() calls lockHdfsTblWithTimeout() only for 
HdfsTable, which means that for IcebergTable tbl.takeReadLock() is 
called, which blocks the catalog topic update collection until it can take the 
table lock. This can be a serious issue, as loading Iceberg tables can take a 
significant amount of time.
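
The difference between the two locking paths can be illustrated with a small, 
self-contained sketch. This is not Impala's code: SketchTable, tryReadLock() 
and addToDeltaOrSkip() are hypothetical names, and the 50 ms timeout is an 
arbitrary value chosen for the demo. It only shows the general technique of 
trying a read lock with a timeout and skipping the table when a writer (here 
standing in for a slow table load) holds the lock, instead of blocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TopicUpdateLockSketch {
  /** Hypothetical stand-in for a catalog table that owns a read/write lock. */
  static class SketchTable {
    final String name;
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    SketchTable(String name) { this.name = name; }
  }

  /** Tries to take the read lock within timeoutMs; true means the caller holds it. */
  static boolean tryReadLock(SketchTable tbl, long timeoutMs)
      throws InterruptedException {
    return tbl.lock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
  }

  /**
   * Adds the table name to the delta if the read lock can be taken in time.
   * Returns false (table skipped) instead of blocking when it cannot.
   */
  static boolean addToDeltaOrSkip(SketchTable tbl, long timeoutMs, List<String> delta)
      throws InterruptedException {
    if (!tryReadLock(tbl, timeoutMs)) return false;
    try {
      delta.add(tbl.name);
      return true;
    } finally {
      tbl.lock.readLock().unlock();
    }
  }

  /** Simulates one update round with one free table and one table being "loaded". */
  static List<String> demo() {
    SketchTable free = new SketchTable("free_tbl");
    SketchTable busy = new SketchTable("busy_tbl");
    CountDownLatch writeLockHeld = new CountDownLatch(1);
    CountDownLatch releaseWriteLock = new CountDownLatch(1);
    // The loader thread holds the write lock, like a long-running table load.
    Thread loader = new Thread(() -> {
      busy.lock.writeLock().lock();
      try {
        writeLockHeld.countDown();
        releaseWriteLock.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        busy.lock.writeLock().unlock();
      }
    });
    loader.start();
    List<String> delta = new ArrayList<>();
    try {
      writeLockHeld.await();
      addToDeltaOrSkip(free, 50, delta);  // lock is free: table is added
      addToDeltaOrSkip(busy, 50, delta);  // writer holds lock: table is skipped
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      releaseWriteLock.countDown();
    }
    return delta;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

With a plain blocking lock() call in place of the timed tryLock(), the second 
call in demo() would wait until the loader thread releases the write lock, 
which corresponds to the blocking behavior described above.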

The skipping logic was implemented in IMPALA-6671 for Hive tables: 
https://github.com/apache/impala/commit/2fccd82590d747d834b8be6f3b05bb446d9bac12

Test for existing skipping logic: 
https://github.com/apache/impala/blob/31769a7fb50ae1d6b6d69d366a776df441e00e3a/tests/custom_cluster/test_topic_update_frequency.py#L30
It uses several debug actions to inject delays to reproduce the blocking issue. 
Testing Iceberg tables should be possible in a similar way.

> Catalog topic update creation can't skip Iceberg tables
> -------------------------------------------------------
>
>                 Key: IMPALA-14801
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14801
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Csaba Ringhofer
>            Assignee: Mihaly Szjatinya
>            Priority: Critical


