[ https://issues.apache.org/jira/browse/IMPALA-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044050#comment-18044050 ]
Quanlong Huang commented on IMPALA-14618:
-----------------------------------------
The output of the test:
{noformat}
-- 2025-12-07 11:45:31,789 INFO MainThread: Starting cluster with command:
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/bin/start-impala-cluster.py
'--state_store_args=--statestore_update_frequency_ms=50
--statestore_priority_update_frequency_ms=50
--statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3
--log_dir=/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests
--log_level=1 '--impalad_args=--logbuflevel=-1 '
'--state_store_args=--logbuflevel=-1
--use_subscriber_id_as_catalogd_priority=true '
'--catalogd_args=--logbuflevel=-1
--catalogd_ha_reset_metadata_on_failover=false
--debug_actions=catalogd_event_processing_delay:SLEEP@3000
--catalogd_ha_failover_catchup_timeout_s=2 --enable_reload_events=true
--warmup_tables_config_file=ofs://localhost:9862/impala/test-warehouse/warmup_table_list.txt
' '--admissiond_args=--logbuflevel=-1 ' --enable_catalogd_ha
--impalad_args=--default_query_options=
11:45:32 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
11:45:32 MainThread: Starting State Store logging to
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests/statestored.INFO
11:45:32 MainThread: Starting Catalog Service logging to
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
11:45:32 MainThread: Starting Catalog Service logging to
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests/catalogd_node1.INFO
11:45:32 MainThread: Starting Impala Daemon logging to
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests/impalad.INFO
11:45:32 MainThread: Starting Impala Daemon logging to
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
11:45:32 MainThread: Starting Impala Daemon logging to
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
11:45:34 MainThread: Found 3 impalad/1 statestored/2 catalogd process(es)
11:45:34 MainThread: Waiting for Impalad webserver port 25000
11:45:34 MainThread: Waiting for Impalad webserver port 25000
11:45:35 MainThread: Waiting for Impalad webserver port 25000
11:45:35 MainThread: Waiting for Impalad webserver port 25000
11:45:36 MainThread: Waiting for Impalad webserver port 25000
11:45:37 MainThread: Waiting for Impalad webserver port 25000
11:45:37 MainThread: Waiting for Impalad webserver port 25001
11:45:37 MainThread: Waiting for Impalad webserver port 25002
11:45:39 MainThread: Waiting for coordinator client services - hs2 port: 21050
hs2-http port: 28000 beeswax port: 21000
11:45:41 MainThread: Waiting for coordinator client services - hs2 port: 21051
hs2-http port: 28001 beeswax port: 21001
11:45:42 MainThread: Waiting for coordinator client services - hs2 port: 21052
hs2-http port: 28002 beeswax port: 21002
11:45:42 MainThread: Getting num_known_live_backends from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25000
11:45:42 MainThread: num_known_live_backends has reached value: 3
11:45:42 MainThread: Getting num_known_live_backends from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25001
11:45:42 MainThread: num_known_live_backends has reached value: 3
11:45:42 MainThread: Getting num_known_live_backends from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25002
11:45:42 MainThread: num_known_live_backends has reached value: 3
11:45:42 MainThread: Total wait: 8.50s
11:45:42 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3
executors).
-- 2025-12-07 11:45:42,893 DEBUG MainThread: Found 3 impalad/1 statestored/2
catalogd process(es)
-- 2025-12-07 11:45:42,893 INFO MainThread: Getting metric:
statestore.live-backends from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25010
-- 2025-12-07 11:45:42,898 INFO MainThread: Metric
'statestore.live-backends' has reached desired value: 5. total_wait: 0s
-- 2025-12-07 11:45:42,898 DEBUG MainThread: Getting num_known_live_backends
from impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25000
-- 2025-12-07 11:45:42,902 INFO MainThread: num_known_live_backends has
reached value: 3
-- 2025-12-07 11:45:42,902 DEBUG MainThread: Getting num_known_live_backends
from impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25001
-- 2025-12-07 11:45:42,906 INFO MainThread: num_known_live_backends has
reached value: 3
-- 2025-12-07 11:45:42,906 DEBUG MainThread: Getting num_known_live_backends
from impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25002
-- 2025-12-07 11:45:42,909 INFO MainThread: num_known_live_backends has
reached value: 3
-- 2025-12-07 11:45:42,910 INFO MainThread: beeswax:
set
client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_failover_catchup_timeout_and_reset;
-- 2025-12-07 11:45:42,910 INFO MainThread: beeswax: connected to
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21000 with
beeswax
-- 2025-12-07 11:45:42,910 INFO MainThread: hs2:
set
client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_failover_catchup_timeout_and_reset;
-- 2025-12-07 11:45:42,910 INFO MainThread: hs2: connected to
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21050 with
impyla hs2
-- 2025-12-07 11:45:42,910 INFO MainThread: hs2-http:
set
client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_failover_catchup_timeout_and_reset;
-- 2025-12-07 11:45:42,911 INFO MainThread: hs2-http: connected to
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:28000 with
impyla hs2-http
-- 2025-12-07 11:45:42,912 INFO MainThread: hs2:
set
client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_failover_catchup_timeout_and_reset;
-- 2025-12-07 11:45:42,912 INFO MainThread: hs2: connected to
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21050 with
impyla hs2
-- 2025-12-07 11:45:42,912 INFO MainThread: hs2:
set
client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_failover_catchup_timeout_and_reset;
-- 2025-12-07 11:45:42,912 INFO MainThread: hs2: set_configuration:
set sync_ddl=False;
-- 2025-12-07 11:45:42,913 INFO MainThread: hs2: executing against Impala
at impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21050.
session: f140e5df06946ad8:3da6b8b852406786 main_cursor: True user: None
DROP DATABASE IF EXISTS `test_failover_catchup_timeout_and_reset_5d02c505`
CASCADE;
-- 2025-12-07 11:45:43,199 INFO MainThread:
19406afce7c44eb7:c2969be800000000: query started
-- 2025-12-07 11:45:43,200 INFO MainThread:
19406afce7c44eb7:c2969be800000000: getting log for operation
-- 2025-12-07 11:45:43,200 INFO MainThread:
19406afce7c44eb7:c2969be800000000: getting runtime profile operation
-- 2025-12-07 11:45:43,201 INFO MainThread:
19406afce7c44eb7:c2969be800000000: closing query for operation
-- 2025-12-07 11:45:44,629 INFO MainThread: hs2: executing against Impala
at impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21050.
session: f140e5df06946ad8:3da6b8b852406786 main_cursor: True user: None
CREATE DATABASE `test_failover_catchup_timeout_and_reset_5d02c505`;
-- 2025-12-07 11:45:44,796 INFO MainThread:
e149c8ad85fe3b8d:26c9d5ff00000000: query started
-- 2025-12-07 11:45:44,796 INFO MainThread:
e149c8ad85fe3b8d:26c9d5ff00000000: getting log for operation
-- 2025-12-07 11:45:44,796 INFO MainThread:
e149c8ad85fe3b8d:26c9d5ff00000000: getting runtime profile operation
-- 2025-12-07 11:45:44,797 INFO MainThread:
e149c8ad85fe3b8d:26c9d5ff00000000: closing query for operation
-- 2025-12-07 11:45:44,797 INFO MainThread: Created database
"test_failover_catchup_timeout_and_reset_5d02c505" for test ID
"custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_failover_catchup_timeout_and_reset"
-- 2025-12-07 11:45:44,797 INFO MainThread: hs2: closing 1 sync and 0 async
hs2 connections to:
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21050
-- 2025-12-07 11:45:44,803 INFO MainThread: Status of healthz page at port
25020: 200
-- 2025-12-07 11:45:44,807 INFO MainThread: Status of healthz page at port
25020: 200
-- 2025-12-07 11:45:44,811 INFO MainThread: Status of healthz page at port
25021: 200
-- 2025-12-07 11:45:44,814 INFO MainThread: Status of healthz page at port
25021: 200
-- 2025-12-07 11:45:44,874 INFO MainThread: hs2: executing against Impala
at impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21050.
session: 3f4ee4fc9a3d2295:9fcfccd7a239d39f main_cursor: True user: None
create table test_failover_catchup_timeout_and_reset_5d02c505.tbl like
functional.alltypes stored as parquet location
'ofs://localhost:9862/impala/test-warehouse/test_failover_catchup_timeout_and_reset_5d02c505.tbl';
-- 2025-12-07 11:45:45,306 INFO MainThread:
9b4d6f0ece167442:054113c500000000: query started
-- 2025-12-07 11:45:45,306 INFO MainThread:
9b4d6f0ece167442:054113c500000000: getting log for operation
-- 2025-12-07 11:45:45,307 INFO MainThread:
9b4d6f0ece167442:054113c500000000: getting runtime profile operation
-- 2025-12-07 11:45:45,307 INFO MainThread:
9b4d6f0ece167442:054113c500000000: closing query for operation
-- 2025-12-07 11:45:45,332 INFO MainThread: Found PID 2336954 for
/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/be/build/latest/service/catalogd
-logbufsecs=5 -v=1 -max_log_files=0 -log_rotation_match_pid=true
-log_filename=catalogd
-log_dir=/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests
-kudu_master_hosts 127.0.0.1 --logbuflevel=-1
--catalogd_ha_reset_metadata_on_failover=false
--debug_actions=catalogd_event_processing_delay:SLEEP@3000
--catalogd_ha_failover_catchup_timeout_s=2 --enable_reload_events=true
--warmup_tables_config_file=ofs://localhost:9862/impala/test-warehouse/warmup_table_list.txt
-catalog_service_port=26000 -state_store_subscriber_port=23020
-webserver_port=25020 -enable_catalogd_ha=true
-- 2025-12-07 11:45:45,354 INFO MainThread: Killing <CatalogdProcess PID:
2336954
(/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/be/build/latest/service/catalogd
-logbufsecs=5 -v=1 -max_log_files=0 -log_rotation_match_pid=true
-log_filename=catalogd
-log_dir=/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/logs/custom_cluster_tests
-kudu_master_hosts 127.0.0.1 --logbuflevel=-1
--catalogd_ha_reset_metadata_on_failover=false
--debug_actions=catalogd_event_processing_delay:SLEEP@3000
--catalogd_ha_failover_catchup_timeout_s=2 --enable_reload_events=true
--warmup_tables_config_file=ofs://localhost:9862/impala/test-warehouse/warmup_table_list.txt
-catalog_service_port=26000 -state_store_subscriber_port=23020
-webserver_port=25020 -enable_catalogd_ha=true)> with signal Signals.SIGKILL
-- 2025-12-07 11:45:45,389 INFO MainThread: Getting metric:
catalog-server.active-status from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25021
-- 2025-12-07 11:45:45,394 INFO MainThread: Waiting for metric value
'catalog-server.active-status'=True. Current value: False. total_wait: 0s
-- 2025-12-07 11:45:45,394 INFO MainThread: Sleeping 1s before next retry.
-- 2025-12-07 11:45:46,395 INFO MainThread: Getting metric:
catalog-server.active-status from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25021
-- 2025-12-07 11:45:46,400 INFO MainThread: Metric
'catalog-server.active-status' has reached desired value: True. total_wait:
1.0060696601867676s
-- 2025-12-07 11:45:46,408 INFO MainThread: Getting metric:
catalog.active-catalogd-address from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25000
-- 2025-12-07 11:45:46,428 INFO MainThread: Waiting for metric value
'catalog.active-catalogd-address'=impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:26001.
Current value:
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:26000.
total_wait: 0s
-- 2025-12-07 11:45:46,428 INFO MainThread: Sleeping 1s before next retry.
-- 2025-12-07 11:45:47,429 INFO MainThread: Getting metric:
catalog.active-catalogd-address from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25000
-- 2025-12-07 11:45:47,434 INFO MainThread: Waiting for metric value
'catalog.active-catalogd-address'=impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:26001.
Current value:
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:26000.
total_wait: 1.0214715003967285s
-- 2025-12-07 11:45:47,434 INFO MainThread: Sleeping 1s before next retry.
-- 2025-12-07 11:45:48,435 INFO MainThread: Getting metric:
catalog.active-catalogd-address from
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:25000
-- 2025-12-07 11:45:48,441 INFO MainThread: Metric
'catalog.active-catalogd-address' has reached desired value:
impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:26001.
total_wait: 2.027543067932129s
-- 2025-12-07 11:45:48,442 INFO MainThread: hs2: executing against Impala
at impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com:21050.
session: 3f4ee4fc9a3d2295:9fcfccd7a239d39f main_cursor: True user: None
describe test_failover_catchup_timeout_and_reset_5d02c505.tbl;
{noformat}
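
For readers skimming the log: the repeated "Waiting for metric value ... Sleeping 1s before next retry" lines follow a plain poll-and-sleep pattern. Below is a minimal sketch of that pattern. The names wait_for_metric and read_metric are hypothetical and only illustrate the loop; they are not the actual helpers in the Impala test framework.

```python
import time


def wait_for_metric(read_metric, expected, timeout_s=30.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll read_metric() until it returns `expected` or the timeout expires.

    read_metric is a hypothetical zero-argument callable standing in for
    whatever fetches the metric (for example from a daemon's web UI).
    Returns the number of polls made; raises TimeoutError on timeout.
    """
    start = clock()
    polls = 0
    while True:
        polls += 1
        value = read_metric()
        if value == expected:
            return polls
        if clock() - start >= timeout_s:
            raise TimeoutError(
                "metric stuck at %r, wanted %r" % (value, expected))
        # Mirrors the "Sleeping 1s before next retry" lines in the log.
        sleep(interval_s)
```

In the log above, catalog-server.active-status reached True on the second poll, and catalog.active-catalogd-address switched to port 26001 on the third poll, right before the describe statement was issued.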
> TestCatalogdHA.test_failover_catchup_timeout_and_reset failing
> --------------------------------------------------------------
>
> Key: IMPALA-14618
> URL: https://issues.apache.org/jira/browse/IMPALA-14618
> Project: IMPALA
> Issue Type: Bug
> Reporter: Surya Hebbar
> Assignee: Venugopal Reddy K
> Priority: Major
> Attachments:
> catalogd.impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com.jenkins.log.INFO.20251207-114532.2336954,
>
> catalogd.impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com.jenkins.log.INFO.20251207-114532.2336965,
>
> impalad.impala-ec2-redhat86-m6i-4xlarge-ondemand-1ce4.vpc.cloudera.com.jenkins.log.INFO.20251207-114532.2337035
>
>
> h3. Error Message
> {code}
> assert '26000' in "Query 294e9bcf7bfb219a:72be327b00000000
> failed:\nAnalysisException: Could not resolve path:
> 'test_failover_catchup_timeout_and_reset_5d02c505.tbl'\n\n" + where '26000'
> = str(26000) + where 26000 = <tests.common.impala_service.CatalogdService
> object at 0x7f51b29a3520>.service_port + where
> <tests.common.impala_service.CatalogdService object at 0x7f51b29a3520> =
> <CatalogdProcess PID: None
> (/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/be/build/latest/service/...ist.txt
> -catalog_service_port=26000 -state_store_subscriber_port=23020
> -webserver_port=25020 -enable_catalogd_ha=true)>.service + and "Query
> 294e9bcf7bfb219a:72be327b00000000 failed:\nAnalysisException: Could not
> resolve path: 'test_failover_catchup_timeout_and_reset_5d02c505.tbl'\n\n" =
> str(HiveServer2Error("Query 294e9bcf7bfb219a:72be327b00000000
> failed:\nAnalysisException: Could not resolve path:
> 'test_failover_catchup_timeout_and_reset_5d02c505.tbl'\n\n"))
> {code}
> h3. Stacktrace
> {code}
> custom_cluster/test_catalogd_ha.py:608: in
> test_failover_catchup_timeout_and_reset
> self._test_metadata_after_failover(
> custom_cluster/test_catalogd_ha.py:787: in _test_metadata_after_failover
> assert str(active_catalogd.service.service_port) in str(e)
> E assert '26000' in "Query 294e9bcf7bfb219a:72be327b00000000
> failed:\nAnalysisException: Could not resolve path:
> 'test_failover_catchup_timeout_and_reset_5d02c505.tbl'\n\n"
> E + where '26000' = str(26000)
> E + where 26000 = <tests.common.impala_service.CatalogdService object
> at 0x7f51b29a3520>.service_port
> E + where <tests.common.impala_service.CatalogdService object at
> 0x7f51b29a3520> = <CatalogdProcess PID: None
> (/data/jenkins/workspace/impala-asf-master-core-ozone/repos/Impala/be/build/latest/service/...ist.txt
> -catalog_service_port=26000 -state_store_subscriber_port=23020
> -webserver_port=25020 -enable_catalogd_ha=true)>.service
> E + and "Query 294e9bcf7bfb219a:72be327b00000000
> failed:\nAnalysisException: Could not resolve path:
> 'test_failover_catchup_timeout_and_reset_5d02c505.tbl'\n\n" =
> str(HiveServer2Error("Query 294e9bcf7bfb219a:72be327b00000000
> failed:\nAnalysisException: Could not resolve path:
> 'test_failover_catchup_timeout_and_reset_5d02c505.tbl'\n\n"))
> {code}