[ 
https://issues.apache.org/jira/browse/IMPALA-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-13926.
-----------------------------------
     Fix Version/s: Impala 5.0.0
    Target Version: Impala 5.0.0
        Resolution: Fixed

> Test TestWorkloadManagementInitNoWait failed in arm s3 builds
> -------------------------------------------------------------
>
>                 Key: IMPALA-13926
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13926
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Yida Wu
>            Assignee: Riza Suminto
>            Priority: Major
>              Labels: broken-build, test-failure
>             Fix For: Impala 5.0.0
>
>
> TestWorkloadManagementInitNoWait has been observed to fail in certain builds.
> The failure seems related to ARM, S3, and data cache, and produces the
> following error messages.
> {code:java}
> custom_cluster.test_workload_mgmt_init.TestWorkloadManagementInitNoWait.test_start_invalid_version
> {code}
> Error Message
> {code:java}
> test setup failure
> {code}
> Stacktrace
> {code:java}
> custom_cluster/test_workload_mgmt_init.py:533: in teardown_method
>     self.wait_for_wm_idle()
> common/custom_cluster_test_suite.py:446: in wait_for_wm_idle
>     "impala-server.completed-queries.queued", 0, timeout=timeout_s, 
> interval=1)
> common/impala_service.py:147: in wait_for_metric_value
>     self.__metric_timeout_assert(metric_name, expected_value, timeout, value)
> common/impala_service.py:183: in __metric_timeout_assert
>     self.dump_debug_webpage_json(debug_page, json_filename)
> common/impala_service.py:88: in dump_debug_webpage_json
>     debug_json = self.get_debug_webpage_json(page_name)
> common/impala_service.py:83: in get_debug_webpage_json
>     return json.loads(self.read_debug_webpage(page_name + "?json"))
> common/impala_service.py:79: in read_debug_webpage
>     return self.open_debug_webpage(page_name, timeout=timeout, 
> interval=interval).text
> common/impala_service.py:76: in open_debug_webpage
>     assert 0, 'Debug webpage did not become available in expected time.'
> E   AssertionError: Debug webpage did not become available in expected time.
> {code}
> Standard Error
> {code:java}
> -- 2025-03-29 19:10:02,456 INFO     MainThread: Created temporary dir 
> /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impala_test_minidumps_QqiGby
> -- 2025-03-29 19:10:02,456 INFO     MainThread: Starting cluster with 
> command: 
> /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/bin/start-impala-cluster.py
>  '--state_store_args=--statestore_update_frequency_ms=50 
> --statestore_priority_update_frequency_ms=50 
> --statestore_heartbeat_frequency_ms=50' --cluster_size=1 --num_coordinators=1 
> --log_dir=/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests
>  --log_level=1 '--impalad_args=--enable_workload_mgmt=true 
> --query_log_write_interval_s=1 --shutdown_grace_period_s=0 
> --shutdown_deadline_s=60 --logbuflevel=-1 --workload_mgmt_schema_version=foo 
> --minidump_path=/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impala_test_minidumps_QqiGby
>  ' '--state_store_args=--logbuflevel=-1  ' 
> '--catalogd_args=--enable_workload_mgmt=true --logbuflevel=-1 
> --workload_mgmt_schema_version=foo 
> --minidump_path=/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impala_test_minidumps_QqiGby
>  ' '--admissiond_args=--logbuflevel=-1  ' 
> --impalad_args=--default_query_options=
> 19:10:02 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 19:10:02 MainThread: Starting State Store logging to 
> /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 19:10:03 MainThread: Starting Catalog Service logging to 
> /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 19:10:03 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 19:10:05 MainThread: Found 1 impalad/1 statestored/1 catalogd process(es)
> 19:10:05 MainThread: Waiting for Impalad webserver port 25000
> 19:10:05 MainThread: Waiting for Impalad webserver port 25000
> 19:10:06 MainThread: Waiting for Impalad webserver port 25000
> 19:10:06 MainThread: Waiting for Impalad webserver port 25000
> 19:10:07 MainThread: Waiting for Impalad webserver port 25000
> 19:10:07 MainThread: Waiting for Impalad webserver port 25000
> 19:10:08 MainThread: Waiting for Impalad webserver port 25000
> 19:10:08 MainThread: Waiting for Impalad webserver port 25000
> 19:10:09 MainThread: Waiting for Impalad webserver port 25000
> 19:10:09 MainThread: Waiting for Impalad webserver port 25000
> 19:10:09 MainThread: Error starting cluster
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/bin/start-impala-cluster.py",
>  line 1190, in <module>
>     impala_cluster.wait_until_ready(expected_cluster_size, 
> expected_num_ready_impalads)
>   File 
> "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/tests/common/impala_cluster.py",
>  line 239, in wait_until_ready
>     impalad.wait_for_webserver(sleep_interval, check_processes_still_running)
>   File 
> "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/tests/common/impala_cluster.py",
>  line 636, in wait_for_webserver
>     early_abort_fn()
>   File 
> "/data/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/tests/common/impala_cluster.py",
>  line 234, in check_processes_still_running
>     assert self.catalogd is not None
> AssertionError
> -- 2025-03-29 19:10:09,861 DEBUG    MainThread: Found 1 impalad/1 
> statestored/0 catalogd process(es)
> -- 2025-03-29 19:10:09,863 INFO     MainThread: Expected log lines could not 
> be found, sleeping before retrying: Expected 1 lines in file 
> /data0/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-rhel88-m7g-4xlarge-ondemand-11dc.vpc.cloudera.com.jenkins.log.FATAL.20250329-144759.294272
>  matching regex 'Invalid workload management schema version 'foo'', but found 
> 0 lines. Last line was: 
> . Impalad exiting.
> -- 2025-03-29 19:10:10,864 INFO     MainThread: Expected log lines could not 
> be found, sleeping before retrying: Expected 1 lines in file 
> /data0/jenkins/workspace/impala-cdw-master-core-s3-arm-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.impala-ec2-rhel88-m7g-4xlarge-ondemand-11dc.vpc.cloudera.com.jenkins.log.FATAL.20250329-144759.294272
>  matching regex 'Invalid workload management schema version 'foo'', but found 
> 0 lines. Last line was: 
> . Impalad exiting.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
