[
https://issues.apache.org/jira/browse/IMPALA-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17959789#comment-17959789
]
ASF subversion and git services commented on IMPALA-14130:
----------------------------------------------------------
Commit 48c4d31344eeedfb988d7bc2a715f265a23fb0d9 in impala's branch
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=48c4d3134 ]
IMPALA-14130: Remove wait_num_tables arg in start-impala-cluster.py
IMPALA-13850 changed the behavior of bin/start-impala-cluster.py to wait
for the number of tables to be at least one. This is needed to detect
that the catalog has seen at least one update. There is special logic in
dataload to start Impala without tables in that circumstance.
This broke the perf-AB-test job, which starts Impala before loading
data. There are other times when we want to start Impala without tables,
and it is inconvenient to need to specify --wait_num_tables each time.
It is actually not necessary to wait for the Coordinator's catalog metric
to reach a certain value. The frontend (Coordinator) will not open its
service port until it has heard the first catalog topic update from
CatalogD. IMPALA-13850 (part 2) also ensures that a CatalogD running with
--catalog_topic_mode=minimal will block serving Coordinator requests
until it begins its first reset() operation. Therefore, waiting for the
Coordinator's catalog version is no longer needed, and the
--wait_num_tables parameter can be removed.
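As an illustration of the reasoning above: if the service port only opens after the first catalog topic update, then a plain TCP connect check is enough to tell that the Coordinator is ready. The following is a minimal sketch under that assumption, not the code in start-impala-cluster.py; the function name wait_for_port and its parameters are hypothetical.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.5):
    """Return True once a TCP connect to (host, port) succeeds.

    Returns False if no connection succeeds before timeout_s elapses.
    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means the service port is open.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: port not open yet, retry.
            time.sleep(interval_s)
    return False
```

A caller would pass the Coordinator's host and whichever service port it is configured to listen on; no table or database count is involved.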
This patch also slightly changes the "progress log" of
start-impala-cluster.py to print the Coordinator's catalog version
instead of the number of databases and tables cached. The sleep interval
now includes the time spent checking the Coordinator's metric.
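One way to make a sleep interval include the time spent on the check itself is to subtract the check's duration from the interval before sleeping. The sketch below shows that idea only; poll_until, its parameters, and the injectable clock and sleep arguments are hypothetical and are not taken from the patch.

```python
import time


def poll_until(check, interval_s=1.0, timeout_s=60.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() roughly every interval_s seconds until it returns True.

    Returns True if check() succeeded, False if timeout_s elapsed first.
    Hypothetical helper for illustration only.
    """
    deadline = clock() + timeout_s
    while True:
        started = clock()
        if check():
            return True
        now = clock()
        if now >= deadline:
            return False
        # Sleep only for what is left of the interval after the check,
        # so a slow check does not stretch the overall polling period.
        remaining = interval_s - (now - started)
        if remaining > 0:
            sleep(min(remaining, deadline - now))
```

With a 1 second interval and a check that takes 0.25 seconds, each sleep lasts 0.75 seconds, so checks start about once per second.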
Testing:
- Pass dataload with updated script.
- Manually run start-impala-cluster.py in both legacy and local catalog
mode and confirm it works.
- Pass custom cluster test_concurrent_ddls.py and test_catalogd_ha.py
Change-Id: I4a3956417ec83de4fb3fc2ef1e72eb3641099f02
Reviewed-on: http://gerrit.cloudera.org:8080/22994
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Riza Suminto <[email protected]>
> start-impala-cluster.py needs to work without tables loaded
> -----------------------------------------------------------
>
> Key: IMPALA-14130
> URL: https://issues.apache.org/jira/browse/IMPALA-14130
> Project: IMPALA
> Issue Type: Task
> Components: Infrastructure
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Assignee: Riza Suminto
> Priority: Major
>
> IMPALA-13850 changed the behavior of bin/start-impala-cluster.py to wait for
> the number of tables to be at least one. This is needed to detect that the
> catalog has seen at least one update. There is special logic in dataload to
> start Impala without tables in that circumstance.
> This broke the perf-AB-test job, which starts Impala before loading data.
> There are other times when we want to start Impala without tables, and it is
> inconvenient to need to specify --wait_num_tables each time.
> We should detect the catalog update by some means other than the number of
> tables, and it needs to work without requiring data to be loaded. Maybe the
> number of databases could work.