Sankar Hariappan created HIVE-21286:
---------------------------------------
Summary: Hive should support clean-up of incrementally bootstrapped tables when retry from different dump.
Key: HIVE-21286
URL: https://issues.apache.org/jira/browse/HIVE-21286
Project: Hive
Issue Type: Bug
Components: repl
Affects Versions: 4.0.0
Reporter: Sankar Hariappan
Assignee: Sankar Hariappan

If external tables are enabled for replication on an existing repl policy, the bootstrap of those external tables is combined with the incremental dump. If the incremental bootstrap load fails with a non-retryable error, the user has to manually drop all the external tables before retrying with another bootstrap dump. For a full bootstrap, we suggested that the user drop the DB in order to retry with a different dump, but in this case they need to drop every external table by hand, which is not user friendly. So this needs to be handled on the Hive side as follows.

REPL LOAD takes an additional config (passed by the user in the WITH clause) that says to drop all the tables that are part of this bootstrap dump. There are 4 possible cases:
1. Only external tables - Drop all external tables before triggering bootstrap load.
2. Only ACID/MM tables - Drop all ACID/MM tables before triggering bootstrap load.
3. Both external and ACID/MM tables - Drop both external and ACID/MM tables before triggering bootstrap load.
4. Table level replication with bootstrap - Drop all the tables that match the diff between the previous and current repl policy (pattern + include/exclude list) before triggering bootstrap load.

Configuration:
hive.repl.bootstrap.cleanup.type = {1=external_tables, 2=transactional_tables, 3=external_and_transactional_tables, 4=table_level}
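
To make the proposed mapping concrete, here is a minimal Java sketch of how the four values of hive.repl.bootstrap.cleanup.type could translate into the table categories to drop before the bootstrap load. This is only an illustration under assumptions: the names BootstrapCleanup, TableKind, CleanupType and fromCode are hypothetical and are not taken from the Hive codebase, and the actual implementation may look quite different.

```java
import java.util.EnumSet;
import java.util.Set;

class BootstrapCleanup {
    // Hypothetical table categories a cleanup pass could drop.
    // POLICY_DIFF stands for tables matching the diff between the
    // previous and current repl policy (case 4 in the description).
    enum TableKind { EXTERNAL, TRANSACTIONAL, POLICY_DIFF }

    // Mirrors the four values proposed for hive.repl.bootstrap.cleanup.type.
    enum CleanupType {
        EXTERNAL_TABLES(1, EnumSet.of(TableKind.EXTERNAL)),
        TRANSACTIONAL_TABLES(2, EnumSet.of(TableKind.TRANSACTIONAL)),
        EXTERNAL_AND_TRANSACTIONAL_TABLES(
            3, EnumSet.of(TableKind.EXTERNAL, TableKind.TRANSACTIONAL)),
        TABLE_LEVEL(4, EnumSet.of(TableKind.POLICY_DIFF));

        final int code;
        final Set<TableKind> drops;

        CleanupType(int code, Set<TableKind> drops) {
            this.code = code;
            this.drops = drops;
        }

        // Resolve the numeric config value; reject anything outside 1..4.
        static CleanupType fromCode(int code) {
            for (CleanupType t : values()) {
                if (t.code == code) {
                    return t;
                }
            }
            throw new IllegalArgumentException("Unknown cleanup type: " + code);
        }
    }

    public static void main(String[] args) {
        CleanupType type = CleanupType.fromCode(3);
        System.out.println(type + " drops " + type.drops);
    }
}
```

Rejecting unknown values up front, rather than silently skipping the cleanup, would keep a mistyped config from leaving stale tables behind before the retry.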