Yida Wu created IMPALA-13926:
--------------------------------

             Summary: Test TestWorkloadManagementInitNoWait failed in arm s3 builds
                 Key: IMPALA-13926
                 URL: https://issues.apache.org/jira/browse/IMPALA-13926
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Yida Wu
            Assignee: Jason Fehr

TestWorkloadManagementInitNoWait has been observed to fail in certain builds. The failures seem to be related to ARM, S3, and the data cache, and produce the following error messages.
{code:java}
custom_cluster.test_workload_mgmt_init.TestWorkloadManagementInitNoWait.test_start_invalid_version
{code}
Error Message
{code:java}
test setup failure
{code}
Stacktrace
{code:java}
custom_cluster/test_workload_mgmt_init.py:533: in teardown_method
    self.wait_for_wm_idle()
common/custom_cluster_test_suite.py:446: in wait_for_wm_idle
    "impala-server.completed-queries.queued", 0, timeout=timeout_s, interval=1)
common/impala_service.py:147: in wait_for_metric_value
    self.__metric_timeout_assert(metric_name, expected_value, timeout, value)
common/impala_service.py:183: in __metric_timeout_assert
    self.dump_debug_webpage_json(debug_page, json_filename)
common/impala_service.py:88: in dump_debug_webpage_json
    debug_json = self.get_debug_webpage_json(page_name)
common/impala_service.py:83: in get_debug_webpage_json
    return json.loads(self.read_debug_webpage(page_name + "?json"))
common/impala_service.py:79: in read_debug_webpage
    return self.open_debug_webpage(page_name, timeout=timeout, interval=interval).text
common/impala_service.py:76: in open_debug_webpage
    assert 0, 'Debug webpage did not become available in expected time.'
E   AssertionError: Debug webpage did not become available in expected time.
{code}
Standard Error
{code:java}
-- 2025-03-29 19:10:02,456 INFO MainThread: Created temporary dir /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impala_test_minidumps_QqiGby
-- 2025-03-29 19:10:02,456 INFO MainThread: Starting cluster with command: /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/bin/start-impala-cluster.py '--state_store_args=--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50' --cluster_size=1 --num_coordinators=1 --log_dir=/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests --log_level=1 '--impalad_args=--enable_workload_mgmt=true --query_log_write_interval_s=1 --shutdown_grace_period_s=0 --shutdown_deadline_s=60 --logbuflevel=-1 --workload_mgmt_schema_version=foo --minidump_path=/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impala_test_minidumps_QqiGby ' '--state_store_args=--logbuflevel=-1 ' '--catalogd_args=--enable_workload_mgmt=true --logbuflevel=-1 --workload_mgmt_schema_version=foo --minidump_path=/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impala_test_minidumps_QqiGby ' '--admissiond_args=--logbuflevel=-1 ' --impalad_args=--default_query_options=
19:10:02 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
19:10:02 MainThread: Starting State Store logging to /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/statestored.INFO
19:10:03 MainThread: Starting Catalog Service logging to /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
19:10:03 MainThread: Starting Impala Daemon logging to /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.INFO
19:10:05 MainThread: Found 1 impalad/1 statestored/1 catalogd process(es)
19:10:05 MainThread: Waiting for Impalad webserver port 25000
19:10:05 MainThread: Waiting for Impalad webserver port 25000
19:10:06 MainThread: Waiting for Impalad webserver port 25000
19:10:06 MainThread: Waiting for Impalad webserver port 25000
19:10:07 MainThread: Waiting for Impalad webserver port 25000
19:10:07 MainThread: Waiting for Impalad webserver port 25000
19:10:08 MainThread: Waiting for Impalad webserver port 25000
19:10:08 MainThread: Waiting for Impalad webserver port 25000
19:10:09 MainThread: Waiting for Impalad webserver port 25000
19:10:09 MainThread: Waiting for Impalad webserver port 25000
19:10:09 MainThread: Error starting cluster
Traceback (most recent call last):
  File "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/bin/start-impala-cluster.py", line 1190, in <module>
    impala_cluster.wait_until_ready(expected_cluster_size, expected_num_ready_impalads)
  File "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/tests/common/impala_cluster.py", line 239, in wait_until_ready
    impalad.wait_for_webserver(sleep_interval, check_processes_still_running)
  File "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/tests/common/impala_cluster.py", line 636, in wait_for_webserver
    early_abort_fn()
  File "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/tests/common/impala_cluster.py", line 234, in check_processes_still_running
    assert self.catalogd is not None
AssertionError
-- 2025-03-29 19:10:09,861 DEBUG MainThread: Found 1 impalad/1 statestored/0 catalogd process(es)
-- 2025-03-29 19:10:09,863 INFO MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-rhel88-m7g-4xlarge-ondemand-11dc.vpc.cloudera.com.jenkins.log.FATAL.20250329-144759.294272 matching regex 'Invalid workload management schema version 'foo'', but found 0 lines. Last line was: . Impalad exiting.
-- 2025-03-29 19:10:10,864 INFO MainThread: Expected log lines could not be found, sleeping before retrying: Expected 1 lines in file /data0/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-rhel88-m7g-4xlarge-ondemand-11dc.vpc.cloudera.com.jenkins.log.FATAL.20250329-144759.294272 matching regex 'Invalid workload management schema version 'foo'', but found 0 lines. Last line was: . Impalad exiting.
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)