Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22634 )

Change subject: IMPALA-13850: Wait until CatalogD active before resetting 
metadata
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/22634/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22634/5//COMMIT_MSG@13
PS5, Line 13: registration.
Let's mention that the lock is held by GatherCatalogUpdatesThread, which is
calling GetCatalogDelta(). That call waits for the Java lock versionLock_, which
is held by the thread running CatalogServiceCatalog.reset().
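
To make the chain concrete for the commit message, here is a rough, self-contained
sketch of the blocking order as I understand it. It is only a model: update_lock
stands in for the lock named in the commit message, version_lock stands in for
versionLock_, and the function and thread names are hypothetical, not Impala code.

```python
import threading
import time


def simulate_lock_chain(reset_seconds):
    """Return how long a 'registration' thread waits for update_lock.

    Hypothetical model: reset holds version_lock, the gather thread holds
    update_lock while waiting for version_lock, and registration needs
    update_lock.
    """
    version_lock = threading.Lock()  # stand-in for the Java versionLock_
    update_lock = threading.Lock()   # stand-in for the lock in the commit message
    reset_started = threading.Event()
    gather_started = threading.Event()
    waited = {}

    def reset_thread():
        # Models CatalogServiceCatalog.reset() holding the version lock.
        with version_lock:
            reset_started.set()
            time.sleep(reset_seconds)

    def gather_thread():
        # Models GatherCatalogUpdatesThread calling GetCatalogDelta().
        reset_started.wait()
        with update_lock:
            gather_started.set()
            with version_lock:  # blocks until the reset thread finishes
                pass

    def registration_thread():
        # Models any caller that needs update_lock while the gather thread is stuck.
        gather_started.wait()
        start = time.monotonic()
        with update_lock:
            waited["seconds"] = time.monotonic() - start

    threads = [
        threading.Thread(target=reset_thread),
        threading.Thread(target=gather_thread),
        threading.Thread(target=registration_thread),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return waited["seconds"]
```

In this model the registration thread is delayed for roughly as long as the reset
runs, even though it never touches version_lock itself.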


http://gerrit.cloudera.org:8080/#/c/22634/5//COMMIT_MSG@23
PS5, Line 23: catalog.num-tables, or /catalog page content.
I think coordinators will still wait for the initial catalog updates from the
statestore:
https://github.com/apache/impala/blob/ddd4f4f8d68addce1542d57f94c637210a090150/be/src/service/impala-server.cc#L3159
before they mark themselves as ready:
https://github.com/apache/impala/blob/ddd4f4f8d68addce1542d57f94c637210a090150/be/src/service/impala-server.cc#L3381-L3382

Will the coordinator also get killed by k8s due to the health check timeout?


http://gerrit.cloudera.org:8080/#/c/22634/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/22634/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2322
PS5, Line 2322:           DebugUtils.executeDebugAction(BackendConfig.INSTANCE.debugActions(),
              :               DebugUtils.RESET_METADATA_LOOP_LOCKED);
What about moving this outside of the loop so the wait time is independent of
the number of dbs? I think we have different numbers of dbs between core and
exhaustive builds.
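
A small sketch of why the placement matters, written in Python only for brevity.
reset_dbs, debug_action and the database names are hypothetical stand-ins, and the
real debug action may behave differently from a plain fixed delay.

```python
def reset_dbs(db_names, debug_action, action_inside_loop):
    """Iterate over databases, firing debug_action per db or once up front."""
    if not action_inside_loop:
        debug_action()  # fires once, regardless of how many dbs exist
    for _db in db_names:
        if action_inside_loop:
            debug_action()  # fires once per database
        # Per-database reset work would go here.


def total_injected_delay(db_names, delay_ms, action_inside_loop):
    """Sum the delay a fixed-sleep debug action would inject."""
    total = [0]

    def fake_sleep_action():
        total[0] += delay_ms  # record the delay instead of sleeping

    reset_dbs(db_names, fake_sleep_action, action_inside_loop)
    return total[0]
```

With a 1000 ms action, three dbs give 3000 ms inside the loop and ten dbs give
10000 ms, while the outside-the-loop variant stays at 1000 ms in both cases.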


http://gerrit.cloudera.org:8080/#/c/22634/5/tests/custom_cluster/test_catalogd_ha.py
File tests/custom_cluster/test_catalogd_ha.py:

http://gerrit.cloudera.org:8080/#/c/22634/5/tests/custom_cluster/test_catalogd_ha.py@109
PS5, Line 109:       assert page.status_code == requests.codes.ok
Can we also verify the coordinator is healthy?
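
Something along these lines might work; it is only a sketch. wait_until_healthy is
a hypothetical helper, not an existing test utility, and fetch_status is assumed to
be a callable that returns the HTTP status code of whatever health endpoint the
coordinator exposes.

```python
import time


def wait_until_healthy(fetch_status, timeout_s=30.0, interval_s=0.5,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns 200 or timeout_s elapses.

    fetch_status is a caller-supplied callable (an assumption of this sketch).
    Connection-level errors are treated as "not healthy yet".
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if fetch_status() == 200:
                return True
        except OSError:
            # The endpoint may not be reachable yet; keep polling.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In the test, fetch_status could wrap a requests.get() call against the
coordinator's debug web UI, followed by assert wait_until_healthy(...). Which URL
to poll is an open question here; I have not checked what the test framework
already provides.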



--
To view, visit http://gerrit.cloudera.org:8080/22634

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cc66dcccedb306ff11893f2916ee5ee6a3efc1
Gerrit-Change-Number: 22634
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Mar 2025 02:43:11 +0000
Gerrit-HasComments: Yes
