[ https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=642935&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-642935 ]
ASF GitHub Bot logged work on HIVE-25303: ----------------------------------------- Author: ASF GitHub Bot Created on: 27/Aug/21 18:02 Start Date: 27/Aug/21 18:02 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2442: URL: https://github.com/apache/hive/pull/2442#discussion_r697614262 ########## File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift ########## @@ -2427,7 +2427,7 @@ service ThriftHiveMetastore extends fb303.FacebookService throws(1:NoSuchObjectException o1, 2:MetaException o2) void add_check_constraint(1:AddCheckConstraintRequest req) throws(1:NoSuchObjectException o1, 2:MetaException o2) - + Table ctas_query_dryrun(1:Table tbl) throws(1:AlreadyExistsException o1, 2:InvalidObjectException o2, 3:MetaException o3, 4:NoSuchObjectException o4) Review comment: this rpc call is there to run the metastore side `translator` - might be better to not dedicate the method call to "ctas" ########## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java ########## @@ -632,37 +631,29 @@ public Table transformCreateTable(Table table, List<String> processorCapabilitie throw new MetaException("Database " + dbName + " for table " + table.getTableName() + " could not be found"); } - if (TableType.MANAGED_TABLE.name().equals(tableType)) { + if (TableType.MANAGED_TABLE.name().equals(tableType)) { LOG.debug("Table is a MANAGED_TABLE"); txnal = params.get(TABLE_IS_TRANSACTIONAL); txn_properties = params.get(TABLE_TRANSACTIONAL_PROPERTIES); - boolean ctas = Boolean.valueOf(params.getOrDefault(TABLE_IS_CTAS, "false")); isInsertAcid = (txn_properties != null && txn_properties.equalsIgnoreCase("insert_only")); if ((txnal == null || txnal.equalsIgnoreCase("FALSE")) && !isInsertAcid) { // non-ACID MANAGED TABLE - if (ctas) { - LOG.info("Not Converting CTAS table " + newTable.getTableName() + " to EXTERNAL tableType for " + processorId); - } else { - LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL tableType for " + processorId); - newTable.setTableType(TableType.EXTERNAL_TABLE.toString()); - params.remove(TABLE_IS_TRANSACTIONAL); - params.remove(TABLE_TRANSACTIONAL_PROPERTIES); - params.put("EXTERNAL", "TRUE"); - params.put(EXTERNAL_TABLE_PURGE, "TRUE"); - params.put("TRANSLATED_TO_EXTERNAL", "TRUE"); - newTable.setParameters(params); - LOG.info("Modified table params are:" + params.toString()); - - if (getLocation(table) == null) { - try { - Path location = getTranslatedToExternalTableDefaultLocation(db, newTable); - newTable.getSd().setLocation(location.toString()); - } catch (Exception e) { - throw new MetaException("Exception determining external table location:" + e.getMessage()); - } - } else { - // table with explicitly set location - // has "translated" properties and will be removed on drop - // should we check tbl directory existence? + LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL tableType for " + processorId); + newTable.setTableType(TableType.EXTERNAL_TABLE.toString()); + params.remove(TABLE_IS_TRANSACTIONAL); + params.remove(TABLE_TRANSACTIONAL_PROPERTIES); + params.put("EXTERNAL", "TRUE"); + params.put(EXTERNAL_TABLE_PURGE, "TRUE"); + params.put("TRANSLATED_TO_EXTERNAL", "TRUE"); + newTable.setParameters(params); + LOG.info("Modified table params are:" + params.toString()); + + if (!table.isSetSd() || table.getSd().getLocation() == null) { + try { + Path newPath = hmsHandler.getWh().getDefaultTablePath(db, table.getTableName(), true); Review comment: one of its tests are broken: http://ci.hive.apache.org/job/hive-precommit/job/PR-2442/11/testReport/org.apache.hadoop.hive.cli/TestNegativeLlapLocalCliDriver/Testing___split_18___PostProcess___testCliDriver_translated_external_rename_/ ########## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java ########## @@ -632,37 +631,29 @@ public Table transformCreateTable(Table table, List<String> processorCapabilitie throw new MetaException("Database " + dbName + " for table " + table.getTableName() + " could not be found"); } - if (TableType.MANAGED_TABLE.name().equals(tableType)) { + if (TableType.MANAGED_TABLE.name().equals(tableType)) { LOG.debug("Table is a MANAGED_TABLE"); txnal = params.get(TABLE_IS_TRANSACTIONAL); txn_properties = params.get(TABLE_TRANSACTIONAL_PROPERTIES); - boolean ctas = Boolean.valueOf(params.getOrDefault(TABLE_IS_CTAS, "false")); isInsertAcid = (txn_properties != null && txn_properties.equalsIgnoreCase("insert_only")); if ((txnal == null || txnal.equalsIgnoreCase("FALSE")) && !isInsertAcid) { // non-ACID MANAGED TABLE - if (ctas) { - LOG.info("Not Converting CTAS table " + newTable.getTableName() + " to EXTERNAL tableType for " + processorId); - } else { - LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL tableType for " + processorId); - newTable.setTableType(TableType.EXTERNAL_TABLE.toString()); - params.remove(TABLE_IS_TRANSACTIONAL); - params.remove(TABLE_TRANSACTIONAL_PROPERTIES); - params.put("EXTERNAL", "TRUE"); - params.put(EXTERNAL_TABLE_PURGE, "TRUE"); - params.put("TRANSLATED_TO_EXTERNAL", "TRUE"); - newTable.setParameters(params); - LOG.info("Modified table params are:" + params.toString()); - - if (getLocation(table) == null) { - try { - Path location = getTranslatedToExternalTableDefaultLocation(db, newTable); - newTable.getSd().setLocation(location.toString()); - } catch (Exception e) { - throw new MetaException("Exception determining external table location:" + e.getMessage()); - } - } else { - // table with explicitly set location - // has "translated" properties and will be removed on drop - // should we check tbl directory existence? + LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL tableType for " + processorId); + newTable.setTableType(TableType.EXTERNAL_TABLE.toString()); + params.remove(TABLE_IS_TRANSACTIONAL); + params.remove(TABLE_TRANSACTIONAL_PROPERTIES); + params.put("EXTERNAL", "TRUE"); + params.put(EXTERNAL_TABLE_PURGE, "TRUE"); + params.put("TRANSLATED_TO_EXTERNAL", "TRUE"); + newTable.setParameters(params); + LOG.info("Modified table params are:" + params.toString()); + + if (!table.isSetSd() || table.getSd().getLocation() == null) { + try { + Path newPath = hmsHandler.getWh().getDefaultTablePath(db, table.getTableName(), true); Review comment: this seem to silently removing HIVE-24920 changes ########## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java ########## @@ -472,6 +474,32 @@ private void setLoadFileLocation( loc = cmv.getLocation(); } Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new Path(loc); + boolean isExternal = false; + boolean isAcid = false; + if (pCtx.getQueryProperties().isCTAS()) { + isExternal = pCtx.getCreateTable().isExternal(); + isAcid = pCtx.getCreateTable().getTblProps().getOrDefault( + hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, "false").equalsIgnoreCase("true") || + pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES); + if ((HiveConf.getBoolVar(conf, HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) { Review comment: I don't understand all these conditionals; shouldn't we just run it in all cases? the most odd is that we will run the inner block when `CREATE_TABLE_AS_EXTERNAL`... doesn't the transformer will also interrogate acid properties on the table/etc? nit: there seems to be some unneccessary parenhesis in this conditional ########## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java ########## @@ -472,6 +474,32 @@ private void setLoadFileLocation( loc = cmv.getLocation(); } Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new Path(loc); + boolean isExternal = false; + boolean isAcid = false; + if (pCtx.getQueryProperties().isCTAS()) { + isExternal = pCtx.getCreateTable().isExternal(); + isAcid = pCtx.getCreateTable().getTblProps().getOrDefault( + hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, "false").equalsIgnoreCase("true") || + pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES); + if ((HiveConf.getBoolVar(conf, HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) { + CreateTableDesc ctd = pCtx.getCreateTable(); + ctd.getTblProps().put(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, "false"); // create as external table Review comment: I simply running the transformer and not care about other things may make things more straightforward ########## File path: ql/src/test/queries/clientpositive/ctas.q ########## @@ -73,3 +73,13 @@ select k, value from acid_ctas_part; explain formatted select k, value from acid_ctas_part; + +-- CTAS with external legacy config Review comment: afaik the translator is disabled in the qtests by default you could look at `translated_external_rename2.q` for an example how to enable it and could you put it into a new test? its easier to debug/fix/etc if it breaks ########## File path: ql/src/test/results/clientpositive/llap/ctas.q.out ########## @@ -160,7 +163,7 @@ STAGE PLANS: Move Operator files: hdfs directory: true -#### A masked pattern was here #### + destination: hdfs://### HDFS PATH ### Review comment: unfortunately we don't see the real path...the masking logic is a bit overzealous in replacing stuff; but the fact that the masking logic have changed might mean that there must be some difference in this case - you should look into the other files which are not masked and see what have changed; I think there will be some important differences there ########## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java ########## @@ -472,6 +474,32 @@ private void setLoadFileLocation( loc = cmv.getLocation(); } Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new Path(loc); + boolean isExternal = false; + boolean isAcid = false; + if (pCtx.getQueryProperties().isCTAS()) { + isExternal = pCtx.getCreateTable().isExternal(); + isAcid = pCtx.getCreateTable().getTblProps().getOrDefault( + hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, "false").equalsIgnoreCase("true") || + pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES); + if ((HiveConf.getBoolVar(conf, HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) { + CreateTableDesc ctd = pCtx.getCreateTable(); + ctd.getTblProps().put(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, "false"); // create as external table + try { + Table table = ctd.toTable(conf); + table = db.getCTASQueryDryrun(table.getTTable()); + org.apache.hadoop.hive.metastore.api.Table tTable = table.getTTable(); + if (tTable.getSd() != null && tTable.getSd().getLocation() != null) { + location = new Path(tTable.getSd().getLocation()); + ctd.setLocation(location.toString()); + } + ctd.setExternal(TableType.EXTERNAL_TABLE.toString().equals(tTable.getTableType())); + ctd.setTblProps(tTable.getParameters()); Review comment: you could probably move these things into a method like `ctd.fromTable(table)` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 642935) Time Spent: 1.5h (was: 1h 20m) > CTAS hive.create.as.external.legacy tries to place data files in managed WH > path > -------------------------------------------------------------------------------- > > Key: HIVE-25303 > URL: https://issues.apache.org/jira/browse/HIVE-25303 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Standalone Metastore > Reporter: Sai Hemanth Gantasala > Assignee: Sai Hemanth Gantasala > Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Under legacy table creation mode (hive.create.as.external.legacy=true), when > a database has been created in a specific LOCATION, in a session where that > database is Used, tables are created using the following command: > {code:java} > CREATE TABLE <tablename> AS SELECT <select statement>{code} > should inherit the HDFS path from the database's location. Instead, Hive is > trying to write the table data into > /warehouse/tablespace/managed/hive/<database_directory_name>/<table_name> > +Design+: > In the CTAS query, first data is written in the target directory (which > happens in HS2) and then the table is created(This happens in HMS). So here > two decisions are being made i) target directory location ii) how the table > should be created (table type, sd e.t.c). > When HS2 needs a target location that needs to be set, it'll make create > table dry run call to HMS (where table translation happens) and i) and ii) > decisions are made within HMS and returns table object. Then HS2 will use > this location set by HMS for placing the data. -- This message was sent by Atlassian Jira (v8.3.4#803005)