Michael Smith created IMPALA-13989:
--------------------------------------

             Summary: ALTER TABLE RENAME can fail with concurrent INVALIDATE 
METADATA
                 Key: IMPALA-13989
                 URL: https://issues.apache.org/jira/browse/IMPALA-13989
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
    Affects Versions: Impala 5.0.0
            Reporter: Michael Smith


IMPALA-13631 stops holding the catalog's versionLock_ writeLock for the whole 
operation (including the HMS RPC). That introduces a possible failure mode 
where {{ALTER TABLE RENAME}} fails with
{quote}Table/view rename succeeded in the Hive Metastore, but failed in 
Impala's Catalog Server.{quote}
when {{INVALIDATE METADATA}} is run concurrently. This shows up in the new 
statements added to test_concurrent_ddls.py.

I can reproduce this error by adding a delay after the HMS {{alter_table}} RPC 
completes (and before we call {{getNextMetastoreEventsForTableIfEnabled}}) and 
running {{INVALIDATE METADATA}} from another session. I think that suggests the 
following sequence:
# {{alter_table}} RPC completes
# Impala {{invalidate metadata}} executes and processes the {{alter_table}} event
# {{alterTableOrViewRename}} runs {{catalog_.alterTable}}, but the old table has 
already been removed from the catalog, so it fails
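
The interleaving above can be replayed deterministically with a toy model. This is only a sketch and not Impala code: {{ToyCatalog}}, its methods, and the table names are hypothetical, and it assumes the cache update requires the old table entry to still be present.

```python
class ToyCatalog:
    """Hypothetical stand-in for catalogd's table cache (not Impala code)."""

    def __init__(self, tables):
        self.tables = set(tables)

    def invalidate_metadata(self, hms_tables):
        # Assumption: a global invalidate rebuilds the cache from what HMS holds now.
        self.tables = set(hms_tables)

    def rename(self, old, new):
        # Assumption: the cache update needs the old entry to still exist.
        if old not in self.tables:
            raise RuntimeError(
                "Table/view rename succeeded in the Hive Metastore, "
                "but failed in Impala's Catalog Server.")
        self.tables.remove(old)
        self.tables.add(new)


hms = {"db.t_old"}
catalog = ToyCatalog(hms)

# Step 1: the alter_table RPC completes, so HMS already has the new name.
hms = {"db.t_new"}
# Step 2: a concurrent INVALIDATE METADATA reloads the cache from HMS.
catalog.invalidate_metadata(hms)
# Step 3: the rename path updates the cache, but the old entry is gone.
try:
    catalog.rename("db.t_old", "db.t_new")
    outcome = "ok"
except RuntimeError:
    outcome = "failed"
```

In this model the cache already reflects the rename when step 3 fails, which is why the error is misleading rather than a sign of lost state.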

This should be pretty rare. Running a global {{invalidate metadata}} is a bad 
idea in a production environment as it's akin to restarting catalogd. However, I 
think we can address this with better error handling in 
{{alterTableOrViewRename}}.
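
One possible shape for that error handling, as a sketch only: if the old entry is missing but the new name is already in the cache, treat the rename as already applied instead of failing. {{rename_in_cache}} and its return values are hypothetical and are not the actual fix or any Impala API.

```python
def rename_in_cache(tables, old, new):
    """Hypothetical helper: apply a rename to a set of cached table names.

    Returns "renamed" when this call replaced old with new, and
    "already_applied" when old is gone but new is already present
    (for example, after a concurrent invalidate reloaded the cache).
    """
    if old in tables:
        tables.discard(old)
        tables.add(new)
        return "renamed"
    if new in tables:
        # Assumption: another code path already picked up the HMS change.
        return "already_applied"
    # Neither name is cached: this is still a genuine inconsistency.
    raise LookupError(f"neither {old} nor {new} found in the cache")


normal = {"db.t_old"}
normal_result = rename_in_cache(normal, "db.t_old", "db.t_new")

raced = {"db.t_new"}  # cache state after the concurrent invalidate
raced_result = rename_in_cache(raced, "db.t_old", "db.t_new")
```

The case where neither name is cached still raises, so a real inconsistency would not be hidden by the more tolerant path.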



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
