Hello Riza Suminto, Michael Smith, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/22511
to look at the new patch set (#8).
Change subject: IMPALA-13772: Fix Workload Management DMLs Timeouts
......................................................................
IMPALA-13772: Fix Workload Management DMLs Timeouts
The insert DMLs executed by workload management to add rows to the
completed queries Iceberg table time out after 10 seconds because
that is the default FETCH_ROWS_TIMEOUT_MS value. If the DML queues up
in admission control, this timeout will quickly cause the DML to be
cancelled. The fix is to set the FETCH_ROWS_TIMEOUT_MS query option
to 0 for the workload management insert DMLs.
Even though the workload management DMLs do not retrieve any rows,
the FETCH_ROWS_TIMEOUT_MS value still applies because the internal
server functions call into the client request state's
ExecQueryOrDmlRequest() function which starts the query executing and
immediately returns. Then, the BlockOnWait function in
impala-server.cc is called. This function times out based on the
FETCH_ROWS_TIMEOUT_MS value.
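To illustrate why a queued DML is affected by a fetch timeout, here is a minimal Python sketch. It is not Impala code. The names block_on_wait and run_queued_dml are hypothetical, and the sketch assumes that a timeout of 0 means "wait indefinitely", which is how this change uses FETCH_ROWS_TIMEOUT_MS=0.

```python
import threading


def block_on_wait(done: threading.Event, timeout_ms: int) -> bool:
    """Return True if `done` is set before the timeout elapses.

    Assumption: a timeout of 0 disables the timeout entirely.
    """
    timeout_s = None if timeout_ms == 0 else timeout_ms / 1000.0
    return done.wait(timeout=timeout_s)


def run_queued_dml(queue_delay_s: float, timeout_ms: int) -> bool:
    """Simulate a DML that waits `queue_delay_s` in admission control.

    Returns True if the wait completed, False if it timed out first.
    """
    done = threading.Event()
    # The timer stands in for the query being admitted and finishing.
    timer = threading.Timer(queue_delay_s, done.set)
    timer.start()
    try:
        return block_on_wait(done, timeout_ms)
    finally:
        timer.cancel()
        timer.join()
```

With a queue delay longer than the timeout, run_queued_dml(0.2, 50) gives up early, while run_queued_dml(0.2, 0) waits until the simulated DML completes.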
A new coordinator startup flag 'query_log_dml_exec_timeout_s' is
added to specify the EXEC_TIME_LIMIT_S query option on the workload
management insert DML statements. This flag ensures the DMLs will
time out if they do not complete in a reasonable timeframe.
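As a rough sketch of how the two query options fit together, the Python below builds the option map for a workload management insert DML. The helper name build_insert_dml_options is hypothetical and only models the behavior described here. The real logic lives in the C++ backend files listed below.

```python
def build_insert_dml_options(dml_exec_timeout_s: int) -> dict:
    """Build query options for a workload management insert DML.

    `dml_exec_timeout_s` stands in for the value of the
    'query_log_dml_exec_timeout_s' coordinator startup flag.
    """
    if dml_exec_timeout_s < 0:
        raise ValueError("timeout must be non-negative")
    return {
        # Disable the fetch timeout so a queued DML is not cancelled early.
        "FETCH_ROWS_TIMEOUT_MS": "0",
        # Bound total execution time so the DML cannot run forever.
        "EXEC_TIME_LIMIT_S": str(dml_exec_timeout_s),
    }
```

The two options are complementary: the first removes the short wait limit, and the second supplies an explicit upper bound in its place.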
While adding the new coordinator startup flag, a bug in the
internal-server code was discovered. This bug caused a return status
of 'ok' even when the query exec time limit was reached and the query
was cancelled. This bug has also been fixed.
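The intended behavior after the fix can be sketched in Python as follows. This is an illustration only: the Status class and final_status function are hypothetical, and the actual cause and fix in internal-server.cc are not described in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    ok: bool
    msg: str = ""


def final_status(cancelled: bool, cancel_reason: str = "") -> Status:
    """Return the status a caller should see once the wait finishes.

    A cancelled query must surface as an error, not as 'ok'.
    """
    if cancelled:
        return Status(ok=False, msg=cancel_reason or "query cancelled")
    return Status(ok=True)
```

A caller that only checks status.ok can then tell a completed DML apart from one that hit its exec time limit.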
Testing:
1. Added new custom cluster test that simulates a busy cluster where
the workload management DML queues for longer than 10 seconds.
2. Existing tests in test_query_log and test_admission_controller
passed.
3. One internal-server-test ctest was modified to assert that an
error status is returned when a query is cancelled.
4. Added a new custom cluster test that asserts the workload
management DML is cancelled based on the value of the new
coordinator startup flag.
Change-Id: I0cc7fbce40eadfb253d8cff5cbb83e2ad63a979f
---
M be/src/service/internal-server-test.cc
M be/src/service/internal-server.cc
M be/src/service/workload-management-worker.cc
M be/src/workload_mgmt/workload-management-flags.cc
A fe/src/test/resources/fair-scheduler-one-query.xml
A fe/src/test/resources/llama-site-one-query.xml
M tests/common/cluster_config.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_query_log.py
9 files changed, 286 insertions(+), 102 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/22511/8
--
To view, visit http://gerrit.cloudera.org:8080/22511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0cc7fbce40eadfb253d8cff5cbb83e2ad63a979f
Gerrit-Change-Number: 22511
Gerrit-PatchSet: 8
Gerrit-Owner: Jason Fehr <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>