Oleksandr Polishchuk created HIVE-21302: -------------------------------------------
Summary: datanucleus.schema.autoCreateAll=true will not work properly as schematool Key: HIVE-21302 URL: https://issues.apache.org/jira/browse/HIVE-21302 Project: Hive Issue Type: Bug Affects Versions: 2.3.4, 2.1.0 Reporter: Oleksandr Polishchuk The bug was found while working with configured environment: {{Apache Hadoop cluster}} and {{Apache Hive}} (the same issue on both versions 2.1 and 2.3). Configuration of working environment was performed in the following steps: # Installed Apache Hadoop cluster # Installed over cluster - Hive component # Configured properties in {{hive-site.xml}} {code:xml} <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:derby:;databaseName=test_db;create=true</value> <description>the URL of the Derby database</description> </property> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>org.apache.derby.jdbc.EmbeddedDriver</value> <description>Driver class name for a JDBC metastore</description> </property> <property> <name>hive.metastore.warehouse.dir</name> <value>/usr/hive/warehouse</value> <description>location of default database for the warehouse</description> </property> <property> <name>datanucleus.schema.autoCreateAll</name> <value>true</value> <description>creates necessary schema on a startup if one doesnt exist.</description> </property> <property> <name>hive.metastore.schema.verification</name> <value>false</value> </property> {code} # Installed {{[IJ|http://db.apache.org/derby/papers/DerbyTut/index.html]}} utility to work with embedded DB, according to steps in the guide. # Launched services: {{hiveserver2}} and {{metastore}} ** {code:java} hive --service hiveserver2 {code} ** {code:java} hive --service metastore {code} # The next step was to start the utility in the next steps: ** {code:java} $ cd $DERBY_INSTALL/bin {code} ** {code:java} $ . setEmbeddedCP {code} ** {code:java} $ echo $CLASSPATH /opt/Apache/db-derby-10.14.2.0-bin/lib/derby.jar:/opt/Apache/db-derby-10.14.2.0-bin/lib/derbytools.jar: {code} ** Start up {{ij}} with this command: {{./ij}} {code:java} connect 'jdbc:derby:/opt/hadoop/hive/metastore_db'; {code} But after performing that command appears the next error. {code:java} ij> connect 'jdbc:derby:/opt/hadoop/hive/metastore_db'; ERROR XJ040: Failed to start database '/opt/hadoop/hive/metastore_db' with class loader sun.misc.Launcher$AppClassLoader@5e2de80c, see the next exception for details. ERROR XSDB6: Another instance of Derby may have already booted the database /opt/hadoop/hive/metastore_db. {code} To resolve this error need to stop all hive services, like hiveserver2 and metastore. Whereupon again enter the command for connect to DB connect {{'jdbc:derby:/opt/hadoop/hive/metastore_db';}} that was successful. # As far as metastore service was launched, he generated all tables lists in the [package.jdo|https://github.com/apache/hive/blob/branch-2.3/metastore/src/model/package.jdo] file, which we could see if perform the next command in {{ij}} utility: {{ij(CONNECTION1)> show tables;}} {code:java} ij> connect 'jdbc:derby:/opt/hadoop/hive/metastore_db'; ij(CONNECTION1)> show tables; TABLE_SCHEM |TABLE_NAME |REMARKS ------------------------------------------------------------------------ SYS |SYSALIASES | SYS |SYSCHECKS | SYS |SYSCOLPERMS | SYS |SYSCOLUMNS | SYS |SYSCONGLOMERATES | SYS |SYSCONSTRAINTS | SYS |SYSDEPENDS | SYS |SYSFILES | SYS |SYSFOREIGNKEYS | SYS |SYSKEYS | SYS |SYSPERMS | SYS |SYSROLES | SYS |SYSROUTINEPERMS | SYS |SYSSCHEMAS | SYS |SYSSEQUENCES | SYS |SYSSTATEMENTS | SYS |SYSSTATISTICS | SYS |SYSTABLEPERMS | SYS |SYSTABLES | SYS |SYSTRIGGERS | SYS |SYSUSERS | SYS |SYSVIEWS | SYSIBM |SYSDUMMY1 | APP |BUCKETING_COLS | APP |CDS | APP |COLUMNS_V2 | APP |DATABASE_PARAMS | APP |DBS | APP |GLOBAL_PRIVS | APP |PARTITIONS | APP |PARTITION_KEYS | APP |PARTITION_KEY_VALS | APP |PARTITION_PARAMS | APP |PART_COL_STATS | APP |ROLES | APP |SDS | APP |SD_PARAMS | APP |SEQUENCE_TABLE | APP |SERDES | APP |SERDE_PARAMS | APP |SKEWED_COL_NAMES | APP |SKEWED_COL_VALUE_LOC_MAP | APP |SKEWED_STRING_LIST | APP |SKEWED_STRING_LIST_VALUES | APP |SKEWED_VALUES | APP |SORT_COLS | APP |TABLE_PARAMS | APP |TAB_COL_STATS | APP |TBLS | APP |VERSION | 50 rows selected {code} # But if instead of start/stop services use the next command: {{/opt/hadoop/hive/bin/schematool -dbType derby -initSchema}}. The result will be fundamentally different because the this command involve {{hive-schema-2.1.0.mysql.sql}} script that generate the following tables: {code:java} ij(CONNECTION1)> show tables; TABLE_SCHEM |TABLE_NAME |REMARKS ------------------------------------------------------------------------ SYS |SYSALIASES | SYS |SYSCHECKS | SYS |SYSCOLPERMS | SYS |SYSCOLUMNS | SYS |SYSCONGLOMERATES | SYS |SYSCONSTRAINTS | SYS |SYSDEPENDS | SYS |SYSFILES | SYS |SYSFOREIGNKEYS | SYS |SYSKEYS | SYS |SYSPERMS | SYS |SYSROLES | SYS |SYSROUTINEPERMS | SYS |SYSSCHEMAS | SYS |SYSSEQUENCES | SYS |SYSSTATEMENTS | SYS |SYSSTATISTICS | SYS |SYSTABLEPERMS | SYS |SYSTABLES | SYS |SYSTRIGGERS | SYS |SYSUSERS | SYS |SYSVIEWS | SYSIBM |SYSDUMMY1 | APP |AUX_TABLE | APP |BUCKETING_COLS | APP |CDS | APP |COLUMNS | APP |COLUMNS_V2 | APP |COMPACTION_QUEUE | APP |COMPLETED_COMPACTIONS | APP |COMPLETED_TXN_COMPONENTS | APP |DATABASE_PARAMS | APP |DBS | APP |DB_PRIVS | APP |DELEGATION_TOKENS | APP |FUNCS | APP |FUNC_RU | APP |GLOBAL_PRIVS | APP |HIVE_LOCKS | APP |IDXS | APP |INDEX_PARAMS | APP |KEY_CONSTRAINTS | APP |MASTER_KEYS | APP |NEXT_COMPACTION_QUEUE_ID | APP |NEXT_LOCK_ID | APP |NEXT_TXN_ID | APP |NOTIFICATION_LOG | APP |NOTIFICATION_SEQUENCE | APP |NUCLEUS_TABLES | APP |PARTITIONS | APP |PARTITION_EVENTS | APP |PARTITION_KEYS | APP |PARTITION_KEY_VALS | APP |PARTITION_PARAMS | APP |PART_COL_PRIVS | APP |PART_COL_STATS | APP |PART_PRIVS | APP |ROLES | APP |ROLE_MAP | APP |SDS | APP |SD_PARAMS | APP |SEQUENCE_TABLE | APP |SERDES | APP |SERDE_PARAMS | APP |SKEWED_COL_NAMES | APP |SKEWED_COL_VALUE_LOC_MAP | APP |SKEWED_STRING_LIST | APP |SKEWED_STRING_LIST_VALUES | APP |SKEWED_VALUES | APP |SORT_COLS | APP |TABLE_PARAMS | APP |TAB_COL_STATS | APP |TBLS | APP |TBL_COL_PRIVS | APP |TBL_PRIVS | APP |TXNS | APP |TXN_COMPONENTS | APP |TYPES | APP |TYPE_FIELDS | APP |VERSION | APP |WRITE_SET | 81 rows selected {code} As a result we could see that property that was used in the {{hive-site.xml}} doesn’t work properly and we should avoid using her. That is we must use only {{schematool}}. {code:java} <property> <name>datanucleus.schema.autoCreateAll</name> <value>true</value> <description>creates necessary schema on a startup if one doesnt exist.</description> </property> {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)