[ 
https://issues.apache.org/jira/browse/IMPALA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048407#comment-18048407
 ] 

ASF subversion and git services commented on IMPALA-13863:
----------------------------------------------------------

Commit 036c4fdf04a1fb14838e2acaf5b2af2719d67a43 in impala's branch 
refs/heads/master from Arnab Karmakar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=036c4fdf0 ]

IMPALA-13863: Add metric to track number of loaded tables in catalogd

Implements a real-time counter that tracks the number of loaded tables
(non-IncompleteTable instances) in the catalog. This helps monitor
catalog memory pressure and query performance impact from implicit table
invalidation mechanisms.

The counter uses AtomicInteger for thread-safety and is updated across
all table state transitions:
- Incremented when IncompleteTable is replaced with a loaded table
- Decremented when tables are invalidated, dropped, or aged out
- Reset to 0 on global INVALIDATE METADATA
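
The transitions listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class name, method names, and boolean parameters are all hypothetical. The only details taken from the commit message are the AtomicInteger and the three kinds of updates.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and not taken from Impala's source.
final class LoadedTableCounterSketch {
  // AtomicInteger keeps updates safe when several threads change table state.
  private final AtomicInteger numLoadedTables = new AtomicInteger(0);

  // Called when one catalog entry replaces another for the same table.
  void onTableReplaced(boolean oldWasLoaded, boolean newIsLoaded) {
    if (!oldWasLoaded && newIsLoaded) {
      // IncompleteTable replaced with a loaded table.
      numLoadedTables.incrementAndGet();
    } else if (oldWasLoaded && !newIsLoaded) {
      // Loaded table invalidated or aged out, back to an incomplete entry.
      numLoadedTables.decrementAndGet();
    }
    // loaded -> loaded (reload) and incomplete -> incomplete leave the count unchanged.
  }

  // Called when a table is dropped from the catalog.
  void onTableRemoved(boolean wasLoaded) {
    if (wasLoaded) numLoadedTables.decrementAndGet();
  }

  // Called on a global INVALIDATE METADATA.
  void onGlobalInvalidate() {
    numLoadedTables.set(0);
  }

  int get() {
    return numLoadedTables.get();
  }

  public static void main(String[] args) {
    LoadedTableCounterSketch counter = new LoadedTableCounterSketch();
    counter.onTableReplaced(false, true);  // first table loaded
    counter.onTableReplaced(false, true);  // second table loaded
    counter.onTableReplaced(true, false);  // one table invalidated
    System.out.println(counter.get());     // prints 1
  }
}
```

Checking the old and new state on every replacement is one way to avoid double counting when a loaded table is simply reloaded. Whether the actual change handles that case the same way is an assumption here.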

Testing:
Manual verification and automated tests confirm correct
behavior across load, invalidate, drop, and timeout scenarios.

Change-Id: I5aa54f9f7507709b654df22e24592799811e8b6c
Reviewed-on: http://gerrit.cloudera.org:8080/23804
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Show number of loaded tables in metrics
> ---------------------------------------
>
>                 Key: IMPALA-13863
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13863
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Arnab Karmakar
>            Priority: Major
>         Attachments: Screenshot 2025-12-23 at 1.40.26 AM.png
>
>
> It would be helpful to show the number of loaded tables (i.e. tables that 
> are not IncompleteTable instances) in catalogd, since several mechanisms 
> implicitly invalidate tables, e.g. invalidate_tables_on_memory_pressure, 
> invalidate_tables_timeout_s, invalidate_metadata_on_event_processing_failure.
> If few tables are actually loaded, query performance suffers because many 
> queries stay in the CREATED state waiting for catalogd to load the 
> metadata of their tables. In that case catalogd should be tuned, e.g. by 
> increasing the JVM heap size.
> There are several places where we can track the total number of loaded tables:
>  # While catalogd is collecting catalog updates in getCatalogDelta(), it 
> iterates through all the tables and can count the loaded ones. However, this 
> takes time and some tables might change state during the iteration.
>  # When a table is loaded and replaces an IncompleteTable, we bump the 
> count, and we decrease it when a loaded table is invalidated.
> The 2nd option can show the real-time count in metrics. The 1st option can be 
> used to improve logging, e.g. by adding a log line saying "saw N tables are loaded".
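
For contrast, the 1st option in the description amounts to counting during a pass over all tables. The sketch below shows that idea in isolation. TableEntry and countLoaded are hypothetical names, and nothing here reflects how getCatalogDelta() is actually written.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of counting loaded tables during one pass over the catalog.
final class SnapshotCountSketch {
  // Stand-in for a catalog entry; 'loaded' is false for an incomplete table.
  static final class TableEntry {
    final String name;
    final boolean loaded;

    TableEntry(String name, boolean loaded) {
      this.name = name;
      this.loaded = loaded;
    }
  }

  // Counts loaded entries in a single pass. The result describes the tables as
  // seen during the pass; entries may change state before it finishes.
  static int countLoaded(List<TableEntry> tables) {
    int count = 0;
    for (TableEntry t : tables) {
      if (t.loaded) count++;
    }
    return count;
  }

  public static void main(String[] args) {
    List<TableEntry> tables = Arrays.asList(
        new TableEntry("db.a", true),
        new TableEntry("db.b", false),
        new TableEntry("db.c", true));
    System.out.println("saw " + countLoaded(tables) + " tables are loaded");
  }
}
```

A count obtained this way costs a full scan and is only a point-in-time value, which is why the description suggests it for logging rather than for a live metric.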



