Jason Dere created HIVE-20998:
---------------------------------
Summary: HiveStrictManagedMigration utility should update DB/Table
location as last migration steps
Key: HIVE-20998
URL: https://issues.apache.org/jira/browse/HIVE-20998
Project: Hive
Issue Type: Sub-task
Reporter: Jason Dere
Assignee: Jason Dere
When processing a database or table, the HiveStrictManagedMigration utility
currently changes the database/table location as the first step in processing
that database/table. Unfortunately, if an error occurs while processing this
database or table, there may still be migration work that needs to continue
for that db/table by running the migration again. However, because the
migration tool only processes dbs/tables that still have the old warehouse
location, the tool will skip over the db/table when the migration is run again.
One fix here is to set the new location as the last step after all of the
migration work is done:
- The new table location will not be set until all of its partitions have been
successfully migrated.
- The new database location will not be set until all of its tables have been
successfully migrated.
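The ordering described above can be illustrated with a small, self-contained
sketch. This is not the HiveStrictManagedMigration code: the Table, Database
and PartitionMigrator types, the relocate helper, and the in-memory model are
all hypothetical stand-ins, and the sketch assumes the new warehouse root does
not itself start with the old root. It only shows why setting the location
last lets a rerun pick up unfinished work.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LocationLastSketch {
    /** Hypothetical stand-in for a table: a location plus its partition names. */
    static class Table {
        String location;
        final List<String> partitions = new ArrayList<>();
        Table(String location) { this.location = location; }
    }

    /** Hypothetical stand-in for a database: a location plus its tables. */
    static class Database {
        String location;
        final Map<String, Table> tables = new LinkedHashMap<>();
        Database(String location) { this.location = location; }
    }

    /** Hypothetical hook for the per-partition work; throwing simulates a failure. */
    interface PartitionMigrator {
        void migrate(String tableName, String partition);
    }

    /** Swaps the old warehouse root prefix for the new one. */
    static String relocate(String location, String oldRoot, String newRoot) {
        return newRoot + location.substring(oldRoot.length());
    }

    static void migrateTable(String name, Table table, String oldRoot,
                             String newRoot, PartitionMigrator migrator) {
        // Only tables still at the old location are processed.
        if (!table.location.startsWith(oldRoot)) {
            return;
        }
        for (String partition : table.partitions) {
            migrator.migrate(name, partition);
        }
        // Last step: reached only if every partition migrated without error.
        table.location = relocate(table.location, oldRoot, newRoot);
    }

    static void migrateDatabase(Database db, String oldRoot,
                                String newRoot, PartitionMigrator migrator) {
        // Only databases still at the old location are processed.
        if (!db.location.startsWith(oldRoot)) {
            return;
        }
        for (Map.Entry<String, Table> entry : db.tables.entrySet()) {
            migrateTable(entry.getKey(), entry.getValue(), oldRoot, newRoot, migrator);
        }
        // Last step: reached only if every table migrated without error.
        db.location = relocate(db.location, oldRoot, newRoot);
    }
}
```

In this sketch, if the migrator throws while processing one table, that table
and its database keep the old location, so a second call to migrateDatabase
processes them again while skipping tables that already completed.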
For existing migrations that failed with an error, the following workaround can
be done so that the db/tables can be re-processed by the migration tool:
1) Use the migration tool logs to find which databases/tables failed during
processing.
2) For each db/table, change the location of the database and table back to the
old location:
ALTER DATABASE tpcds_bin_partitioned_orc_10 SET LOCATION
'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db';
ALTER TABLE tpcds_bin_partitioned_orc_10.store_sales SET LOCATION
'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db/store_sales';
3) Rerun the migration tool.
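When many databases/tables failed, the location-reset statements in the
workaround could be generated rather than typed by hand. The following is a
rough sketch only: the RevertLocationStatements class and its methods are
hypothetical, and it assumes every affected db/table used the default
<old warehouse root>/<db>.db/<table> layout seen in the example statements.
Tables that had a custom location would need their original path looked up
instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RevertLocationStatements {
    /** Builds the ALTER DATABASE statement, assuming the default <root>/<db>.db layout. */
    static String alterDatabase(String oldWarehouseRoot, String db) {
        return "ALTER DATABASE " + db + " SET LOCATION '"
                + oldWarehouseRoot + "/" + db + ".db';";
    }

    /** Builds the ALTER TABLE statement, assuming the default <root>/<db>.db/<table> layout. */
    static String alterTable(String oldWarehouseRoot, String db, String table) {
        return "ALTER TABLE " + db + "." + table + " SET LOCATION '"
                + oldWarehouseRoot + "/" + db + ".db/" + table + "';";
    }

    /** One statement for the database, followed by one per failed table. */
    static List<String> statementsFor(String oldWarehouseRoot, String db, List<String> tables) {
        List<String> statements = new ArrayList<>();
        statements.add(alterDatabase(oldWarehouseRoot, db));
        for (String table : tables) {
            statements.add(alterTable(oldWarehouseRoot, db, table));
        }
        return statements;
    }

    public static void main(String[] args) {
        // Values taken from the example statements in the workaround.
        List<String> statements = statementsFor(
                "hdfs://ns1/apps/hive/warehouse",
                "tpcds_bin_partitioned_orc_10",
                Arrays.asList("store_sales"));
        for (String statement : statements) {
            System.out.println(statement);
        }
    }
}
```

The generated statements should be reviewed against the migration tool logs
before being run, since the sketch has no knowledge of which objects actually
failed.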