Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/22816 )

Change subject: IMPALA-13829: Postpone catalog deleteLog GC for waitForHmsEvent requests
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/22816/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22816/2//COMMIT_MSG@27
PS2, Line 27: postponing
> nit: postponing
Done


http://gerrit.cloudera.org:8080/#/c/22816/2//COMMIT_MSG@33
PS2, Line 33: A new flag, catalog_delete_log_gc_frequency, is added for this. The
            : deleteLog GC happens in every N+1 (N=catalog_delete_log_gc_frequency)
            : topic updates.
> How many seconds between GC does it translate to in default config?
Catalog updates are only sent out when there are catalog changes (due to 
DDL/DML/HMS events). Assuming the catalog keeps changing, catalog updates are 
supposed to be sent every 2 seconds (configured by 
statestore_update_frequency_ms). However, the catalog update thread can be 
blocked by table locks held by concurrent DDLs, so the actual interval is 
usually larger than 2s. It can sometimes be minutes, depending on how long the 
DDL holds the table lock.

When the GC happens, only items older than the last 1000 topic updates are 
cleared, so an item can survive for up to 2000 rounds of topic updates. Assuming 
catalog updates are sent at the fastest rate (one every 2s), 2000 rounds of 
topic updates take 4000s.
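
A minimal sketch of the arithmetic above, in case it helps. The function and 
parameter names are hypothetical (not from the patch), and the default of 1000 
for the GC frequency is an assumption inferred from the numbers in this reply:

```python
def max_delete_log_retention_s(gc_frequency=1000, update_interval_ms=2000):
    """Rough upper bound, in seconds, on how long a deleteLog item survives.

    gc_frequency: assumed value of catalog_delete_log_gc_frequency.
    update_interval_ms: assumed value of statestore_update_frequency_ms.
    """
    # Worst case described above: an item is still inside the retained window
    # at one GC and is only cleared by the next one, so it can live for about
    # 2 * gc_frequency rounds of topic updates.
    max_rounds = 2 * gc_frequency
    return max_rounds * update_interval_ms / 1000.0


print(max_delete_log_retention_s())
```

This only covers the best case where updates go out every 2s. If the catalog 
update thread is blocked by table locks, the real retention time is longer.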

sync_hms_events_wait_time_s is the timeout used to wait for HMS events to be 
processed. It doesn't matter if deleteLog GC happens in the middle of the wait, 
since we assume 1000 rounds of catalog updates are enough for the impalad to 
receive the GCed deletions.


http://gerrit.cloudera.org:8080/#/c/22816/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/22816/2/be/src/catalog/catalog-server.cc@177
PS2, Line 177: catalog_delete_log_gc_frequency
> Please add validator that this flag is always a positive number. Please do
Done


http://gerrit.cloudera.org:8080/#/c/22816/2/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java:

http://gerrit.cloudera.org:8080/#/c/22816/2/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@137
PS2, Line 137: <=
> nit: <= for safety.
Done


http://gerrit.cloudera.org:8080/#/c/22816/2/tests/metadata/test_event_processing.py
File tests/metadata/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/22816/2/tests/metadata/test_event_processing.py@626
PS2, Line 626:       assert False, "Failed to drop dat
> Assert that this is success?
It seems we don't need an assertion here. If it fails, it raises an exception:

tests/metadata/test_event_processing.py:626: in test_hms_event_sync_on_deletion
    client.execute("create database " + db)
tests/common/impala_connection.py:505: in execute
    fetch_exec_summary=fetch_exec_summary)
tests/beeswax/impala_beeswax.py:195: in execute
    handle = self.__execute_query(query_string.strip(), user=user)
tests/beeswax/impala_beeswax.py:291: in __execute_query
    handle = self.execute_query_async(query_string, user=user)
tests/beeswax/impala_beeswax.py:285: in execute_query_async
    handle = self.__do_rpc(lambda: self.imp_service.query(query,))
tests/beeswax/impala_beeswax.py:470: in __do_rpc
    raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: Query 9c413ce6f3c038e5:c6ebe8ce00000000 failed:
E   AnalysisException: Database already exists: test_hms_event_sync_on_deletion_115fe4d9_db


http://gerrit.cloudera.org:8080/#/c/22816/2/tests/metadata/test_event_processing.py@640
PS2, Line 640:           self.hive_client.drop_table(db, tbl_name, deleteData=True)
             :           LOG.info("Dropped table {}.{} in Hive".format(db, tbl_name))
> Wrap L671 to L639 in try block.
Done



--
To view, visit http://gerrit.cloudera.org:8080/22816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2441440bca2b928205dd514047ba742a5e8bf05e
Gerrit-Change-Number: 22816
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Fri, 25 Apr 2025 06:59:14 +0000
Gerrit-HasComments: Yes
