Sankar Hariappan created HIVE-21286:
---------------------------------------

             Summary: Hive should support clean-up of incrementally 
bootstrapped tables when retry from different dump.
                 Key: HIVE-21286
                 URL: https://issues.apache.org/jira/browse/HIVE-21286
             Project: Hive
          Issue Type: Bug
          Components: repl
    Affects Versions: 4.0.0
            Reporter: Sankar Hariappan
            Assignee: Sankar Hariappan


If external tables are enabled for replication on an existing repl policy, then 
bootstrapping of external tables is combined with the incremental dump.
If the incremental bootstrap load fails with a non-retryable error, the user 
has to manually drop all the external tables before retrying with another 
bootstrap dump. For full bootstrap, to retry with a different dump, we 
suggested that the user drop the DB, but in this case they need to manually 
drop all the external tables, which is not user friendly. So, this needs to be 
handled on the Hive side as follows.

REPL LOAD takes an additional config (passed by the user in the WITH clause) 
that says to drop all the tables which are part of this bootstrap dump. There 
are 4 possible cases.
1. Only external tables - Drop all external tables before triggering bootstrap 
load.
2. Only ACID/MM tables - Drop all ACID/MM tables before triggering bootstrap 
load.
3. Both external and ACID/MM tables - Drop both external and ACID/MM tables 
before triggering bootstrap load.
4. Table level replication with bootstrap - Drop all the tables that match the 
diff between the previous and current repl policy (pattern+include/exclude 
list) before triggering bootstrap load.
Configuration: hive.repl.bootstrap.cleanup.type=
{1=external_tables, 2=transactional_tables, 
3=external_and_transactional_tables, 4=table_level}
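
The four cases above can be sketched as a small selection function. This is 
only an illustration in Python, not Hive code: the CleanupType enum mirrors the 
values listed for hive.repl.bootstrap.cleanup.type, while the Table class, the 
tables_to_drop function and the newly_included argument are hypothetical names 
introduced here. It assumes the input is the list of tables that are part of 
the bootstrap dump.

```python
from dataclasses import dataclass
from enum import Enum


class CleanupType(Enum):
    # Names mirror the values proposed for hive.repl.bootstrap.cleanup.type.
    EXTERNAL_TABLES = 1
    TRANSACTIONAL_TABLES = 2
    EXTERNAL_AND_TRANSACTIONAL_TABLES = 3
    TABLE_LEVEL = 4


@dataclass(frozen=True)
class Table:
    # Hypothetical stand-in for table metadata found in a bootstrap dump.
    name: str
    external: bool = False
    transactional: bool = False  # ACID or MM table


def tables_to_drop(tables, cleanup_type, newly_included=frozenset()):
    """Return the names of tables to drop before the bootstrap load is retried.

    newly_included is an assumed, precomputed set of table names that match
    the diff between the previous and current repl policy. It is only
    consulted for TABLE_LEVEL.
    """
    if cleanup_type is CleanupType.EXTERNAL_TABLES:
        selected = lambda t: t.external
    elif cleanup_type is CleanupType.TRANSACTIONAL_TABLES:
        selected = lambda t: t.transactional
    elif cleanup_type is CleanupType.EXTERNAL_AND_TRANSACTIONAL_TABLES:
        selected = lambda t: t.external or t.transactional
    elif cleanup_type is CleanupType.TABLE_LEVEL:
        selected = lambda t: t.name in newly_included
    else:
        raise ValueError(f"unknown cleanup type: {cleanup_type!r}")
    # Preserve the order in which tables appear in the dump.
    return [t.name for t in tables if selected(t)]
```

For example, given one external, one ACID and one plain managed table, 
EXTERNAL_TABLES selects only the external table, and TABLE_LEVEL selects 
whichever names are passed in newly_included.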



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
