[ https://issues.apache.org/jira/browse/HIVE-8730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220036#comment-14220036 ]
Ryan Pridgeon commented on HIVE-8730: ------------------------------------- The schema tool also fails to update PART_NAME in table PARTITIONS. Without this Hive will enter an infinite loop when a user attempts to drop the table. I would include the following Update statement in addition to handling none date type entries such as __HIVE_DEFAULT_PARTITION__ UPDATE PARTITIONS JOIN PARTITION_KEYS ON PARTITIONS.TBL_ID= PARTITION_KEYS.TBL_ID SET PART_NAME=ifnull(REPLACE(PART_NAME,RIGHT(PART_NAME,8),cast(RIGHT(PART_NAME,8) as date)),PART_NAME) WHERE PKEY_TYPE= 'date' and PART_NAME not like '%-%-%'; Ref:http://github.mtv.cloudera.com/CDH/hive/blob/cdh5-0.13.1_5.2.0/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#1504 LINE 1480: private List<Path> dropPartitionsAndGetLocations(RawStore ms, String dbName, String tableName, Path tablePath, List<FieldSchema> partitionKeys, boolean checkLocation) throws MetaException, IOException, NoSuchObjectException, InvalidObjectException, InvalidInputException { int partitionBatchSize = HiveConf.getIntVar(hiveConf, ConfVars.METASTORE_BATCH_RETRIEVE_MAX); Path tableDnsPath = null; if (tablePath != null) { tableDnsPath = wh.getDnsPath(tablePath); } List<Path> partPaths = new ArrayList<Path>(); Table tbl = ms.getTable(dbName, tableName); // call dropPartition on each of the table's partitions to follow the // procedure for cleanly dropping partitions. while (true) { List<Partition> partsToDelete = ms.getPartitions(dbName, tableName, partitionBatchSize); if (partsToDelete == null || partsToDelete.isEmpty()) { break; } List<String> partNames = new ArrayList<String>(); for (Partition part : partsToDelete) { if (checkLocation && part.getSd() != null && part.getSd().getLocation() != null) { Path partPath = wh.getDnsPath(new Path(part.getSd().getLocation())); if (tableDnsPath == null || (partPath != null && !isSubdirectory(tableDnsPath, partPath))) { if (!wh.isWritable(partPath.getParent())) { throw new MetaException("Table metadata not deleted since the partition " + Warehouse.makePartName(partitionKeys, part.getValues()) + " has parent location " + partPath.getParent() + " which is not writable " + "by " + hiveConf.getUser()); } partPaths.add(partPath); } } partNames.add(Warehouse.makePartName(tbl.getPartitionKeys(), part.getValues())); } ms.dropPartitions(dbName, tableName, partNames); } return partPaths; } > schemaTool failure when date partition has non-date value > --------------------------------------------------------- > > Key: HIVE-8730 > URL: https://issues.apache.org/jira/browse/HIVE-8730 > Project: Hive > Issue Type: Bug > Components: Metastore > Affects Versions: 0.13.0 > Environment: CDH5.2 > Reporter: Johndee Burks > Assignee: Chaoyu Tang > Priority: Minor > > If there is a none date value in the PART_KEY_VAL column within the > PARTITION_KEY_VALS table in the metastore db, this will cause the HIVE-5700 > script to fail. The failure will be picked up by the schemaTool causing the > upgrade to fail. A classic example of a value that can be present without > users really being aware is __HIVE_DEFAULT_PARTITION__ which is filled in by > hive automatically when doing dynamic partitioning and value is not present > in source data for the partition column. > The reason for the failure is that the upgrade script does not account for > none date values. What it is currently: > {code} > UPDATE PARTITION_KEY_VALS > INNER JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID > INNER JOIN PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID > AND PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX > AND PARTITION_KEYS.PKEY_TYPE = 'date' > SET PART_KEY_VAL = IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), > PART_KEY_VAL); > {code} > What it should be to avoid issue: > {code} > UPDATE PARTITION_KEY_VALS > INNER JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID > INNER JOIN PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID > AND PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX > AND PARTITION_KEYS.PKEY_TYPE = 'date' > AND PART_KEY_VAL != '__HIVE_DEFAULT_PARTITION__' > SET PART_KEY_VAL = IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), > PART_KEY_VAL); > {code} > == Metastore DB > {code} > mysql> select * from PARTITION_KEY_VALS; > +---------+----------------------------+-------------+ > | PART_ID | PART_KEY_VAL | INTEGER_IDX | > +---------+----------------------------+-------------+ > | 171 | 2099-12-31 | 0 | > | 172 | __HIVE_DEFAULT_PARTITION__ | 0 | > | 184 | 2099-12-01 | 0 | > | 185 | 2099-12-30 | 0 | > +---------+----------------------------+-------------+ > {code} > == stdout.log > {code} > 0: jdbc:mysql://10.16.8.121:3306/metastore> !autocommit on > 0: jdbc:mysql://10.16.8.121:3306/metastore> SELECT 'Upgrading MetaStore > schema from 0.12.0 to 0.13.0' AS ' ' > +---------------------------------------------------+--+ > | | > +---------------------------------------------------+--+ > | Upgrading MetaStore schema from 0.12.0 to 0.13.0 | > +---------------------------------------------------+--+ > 0: jdbc:mysql://10.16.8.121:3306/metastore> SELECT '< HIVE-5700 enforce > single date format for partition column storage >' AS ' ' > +------------------------------------------------------------------------+--+ > | | > +------------------------------------------------------------------------+--+ > | < HIVE-5700 enforce single date format for partition column storage > | > +------------------------------------------------------------------------+--+ > 0: jdbc:mysql://10.16.8.121:3306/metastore> UPDATE PARTITION_KEY_VALS INNER > JOIN PARTITIONS ON PARTITION_KEY_VALS.PART_ID = PARTITIONS.PART_ID INNER JOIN > PARTITION_KEYS ON PARTITION_KEYS.TBL_ID = PARTITIONS.TBL_ID AND > PARTITION_KEYS.INTEGER_IDX = PARTITION_KEY_VALS.INTEGER_IDX AND > PARTITION_KEYS.PKEY_TYPE = 'date' SET PART_KEY_VAL = > IFNULL(DATE_FORMAT(cast(PART_KEY_VAL as date),'%Y-%m-%d'), PART_KEY_VAL) > {code} > == stderr.log > {code} > exec /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop > jar > /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-cli-0.13.1-cdh5.2.0.jar > org.apache.hive.beeline.HiveSchemaTool -verbose -dbType mysql -upgradeSchema > Connecting to > jdbc:mysql://10.16.8.121:3306/metastore?useUnicode=true&characterEncoding=UTF-8 > Connected to: MySQL (version 5.1.73) > Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( > Revision: ${bzr.revision-id} )) > Transaction isolation: TRANSACTION_READ_COMMITTED > Autocommit status: true > 1 row selected (0.025 seconds) > 1 row selected (0.004 seconds) > Closing: 0: > jdbc:mysql://10.16.8.121:3306/metastore?useUnicode=true&characterEncoding=UTF-8 > org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore > state would be inconsistent !! > org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore > state would be inconsistent !! > at > org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:252) > at > org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:220) > at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:530) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > Caused by: java.io.IOException: Schema script failed, errorcode 2 > at > org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:410) > at > org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:383) > at > org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:247) > ... 7 more > *** schemaTool failed *** > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)