Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/23194 )

Change subject: IMPALA-14220 (part 2): Delay AcceptRequest until catalog is 
stable
......................................................................

IMPALA-14220 (part 2): Delay AcceptRequest until catalog is stable

CatalogD availability has improved now that reading is_active_ no longer
requires holding catalog_lock_. However, during a failover, requests may
slip into the CatalogD that is transitioning from passive to active and
obtain stale metadata.

This patch improves the situation in two steps. First, it adds a new
mutex ha_transition_lock_ that must be obtained by AcceptRequest() in HA
mode. This mutex protects both CatalogServer::WaitPendingResetStarts() and
CatalogServer::UpdateActiveCatalogd(). WaitPendingResetStarts() will
only exit and return to AcceptRequest() after the triggered_first_reset_
flag is True (initial metadata reset has completed) or
min_catalog_resets_to_serve_ is met. If only the latter happens, the
request will go through the Catalog JVM and subsequently be blocked by
CatalogResetManager.waitOngoingMetadataFetch() until the metadata reset
has progressed beyond the requested database/table.

Second, it increments numCatalogResetStarts_ on every global reset
(Invalidate Metadata) initiated by catalog-server.cc.
CatalogServer::MarkPendingMetadataReset() matches this logic by
incrementing min_catalog_resets_to_serve_ before setting the
triggered_first_reset_ flag to False (which wakes up the
TriggerResetMetadata thread).
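
The two steps above can be pictured with a small standalone model. This
is only a sketch under assumptions, not the code in catalog-server.cc:
the class name HaGateSketch, the helpers NoteResetStarted() and
NoteFirstResetDone(), the state_lock_/state_cv_ pair, and the timeout
parameter are all hypothetical, and the real patch may structure the
locking differently. It only illustrates the gating condition
(first reset done, or enough reset starts observed) and the ordering
inside MarkPendingMetadataReset().

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Illustrative model only. Member names follow the commit message; the class
// name, NoteResetStarted(), NoteFirstResetDone(), state_lock_, state_cv_ and
// the timeout parameter are hypothetical and not part of Impala.
class HaGateSketch {
 public:
  // Models the passive-to-active transition. Holding ha_transition_lock_
  // keeps AcceptRequest() from interleaving with the role change.
  void UpdateActiveCatalogd() {
    std::lock_guard<std::mutex> ha(ha_transition_lock_);
    MarkPendingMetadataReset();
  }

  // Models the reset thread reporting that a global reset has started.
  void NoteResetStarted() {
    {
      std::lock_guard<std::mutex> l(state_lock_);
      ++num_catalog_reset_starts_;
    }
    state_cv_.notify_all();
  }

  // Models the reset thread reporting that the initial reset has completed.
  void NoteFirstResetDone() {
    {
      std::lock_guard<std::mutex> l(state_lock_);
      triggered_first_reset_ = true;
    }
    state_cv_.notify_all();
  }

  // Returns true once the request may proceed, false if the timeout expires
  // while the catalog is still not ready to serve.
  bool AcceptRequest(std::chrono::milliseconds timeout) {
    std::lock_guard<std::mutex> ha(ha_transition_lock_);
    return WaitPendingResetStarts(timeout);
  }

 private:
  // Raise the required number of reset starts first, then mark the first
  // reset as pending, so a waiter never sees the flag cleared with a stale
  // threshold.
  void MarkPendingMetadataReset() {
    std::lock_guard<std::mutex> l(state_lock_);
    ++min_catalog_resets_to_serve_;
    triggered_first_reset_ = false;
  }

  // Waits until the first reset has completed or enough resets have started.
  bool WaitPendingResetStarts(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(state_lock_);
    return state_cv_.wait_for(l, timeout, [this] {
      return triggered_first_reset_ ||
             num_catalog_reset_starts_ >= min_catalog_resets_to_serve_;
    });
  }

  std::mutex ha_transition_lock_;  // serializes HA transition vs. requests
  std::mutex state_lock_;          // protects the three fields below
  std::condition_variable state_cv_;
  bool triggered_first_reset_ = true;
  int64_t num_catalog_reset_starts_ = 0;
  int64_t min_catalog_resets_to_serve_ = 0;
};
```

In this model a request arriving right after UpdateActiveCatalogd() is
held back until either NoteResetStarted() or NoteFirstResetDone() is
called. In the real system the former case still relies on
CatalogResetManager.waitOngoingMetadataFetch() on the Java side to block
until the requested database/table has been refreshed.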

Renamed WaitForCatalogReady() to
WaitCatalogReadinessForWorkloadManagement() since this wait mechanism is
specific to Workload Management initialization and has stricter
requirements.

Removed CatalogServer::IsActive() since its only call site is replaced
with CatalogServer::WaitHATransition().

Testing:
Added test_metadata_after_failover_with_delayed_reset and
test_metadata_after_failover_with_hms_sync.

Change-Id: I370d21319335318e441ec3c3455bac4227803900
Reviewed-on: http://gerrit.cloudera.org:8080/23194
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/catalog/catalogd-main.cc
M be/src/catalog/workload-management-init.cc
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M tests/custom_cluster/test_catalogd_ha.py
11 files changed, 191 insertions(+), 74 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/23194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I370d21319335318e441ec3c3455bac4227803900
Gerrit-Change-Number: 23194
Gerrit-PatchSet: 10
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jason Fehr <jf...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
