[ 
https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=642935&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-642935
 ]

ASF GitHub Bot logged work on HIVE-25303:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Aug/21 18:02
            Start Date: 27/Aug/21 18:02
    Worklog Time Spent: 10m 
      Work Description: kgyrtkirk commented on a change in pull request #2442:
URL: https://github.com/apache/hive/pull/2442#discussion_r697614262



##########
File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##########
@@ -2427,7 +2427,7 @@ service ThriftHiveMetastore extends fb303.FacebookService
       throws(1:NoSuchObjectException o1, 2:MetaException o2)
   void add_check_constraint(1:AddCheckConstraintRequest req)
       throws(1:NoSuchObjectException o1, 2:MetaException o2)
-
+  Table ctas_query_dryrun(1:Table tbl) throws(1:AlreadyExistsException o1, 
2:InvalidObjectException o2, 3:MetaException o3, 4:NoSuchObjectException o4)

Review comment:
       this rpc call is there to run the metastore side `translator` - might be 
better to not dedicate the method call to "ctas"

##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java
##########
@@ -632,37 +631,29 @@ public Table transformCreateTable(Table table, 
List<String> processorCapabilitie
       throw new MetaException("Database " + dbName + " for table " + 
table.getTableName() + " could not be found");
     }
 
-    if (TableType.MANAGED_TABLE.name().equals(tableType)) {
+      if (TableType.MANAGED_TABLE.name().equals(tableType)) {
       LOG.debug("Table is a MANAGED_TABLE");
       txnal = params.get(TABLE_IS_TRANSACTIONAL);
       txn_properties = params.get(TABLE_TRANSACTIONAL_PROPERTIES);
-      boolean ctas = Boolean.valueOf(params.getOrDefault(TABLE_IS_CTAS, 
"false"));
       isInsertAcid = (txn_properties != null && 
txn_properties.equalsIgnoreCase("insert_only"));
       if ((txnal == null || txnal.equalsIgnoreCase("FALSE")) && !isInsertAcid) 
{ // non-ACID MANAGED TABLE
-        if (ctas) {
-          LOG.info("Not Converting CTAS table " + newTable.getTableName() + " 
to EXTERNAL tableType for " + processorId);
-        } else {
-          LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL 
tableType for " + processorId);
-          newTable.setTableType(TableType.EXTERNAL_TABLE.toString());
-          params.remove(TABLE_IS_TRANSACTIONAL);
-          params.remove(TABLE_TRANSACTIONAL_PROPERTIES);
-          params.put("EXTERNAL", "TRUE");
-          params.put(EXTERNAL_TABLE_PURGE, "TRUE");
-          params.put("TRANSLATED_TO_EXTERNAL", "TRUE");
-          newTable.setParameters(params);
-          LOG.info("Modified table params are:" + params.toString());
-
-          if (getLocation(table) == null) {
-            try {
-              Path location = getTranslatedToExternalTableDefaultLocation(db, 
newTable);
-              newTable.getSd().setLocation(location.toString());
-            } catch (Exception e) {
-              throw new MetaException("Exception determining external table 
location:" + e.getMessage());
-            }
-          } else {
-            // table with explicitly set location
-            // has "translated" properties and will be removed on drop
-            // should we check tbl directory existence?
+        LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL 
tableType for " + processorId);
+        newTable.setTableType(TableType.EXTERNAL_TABLE.toString());
+        params.remove(TABLE_IS_TRANSACTIONAL);
+        params.remove(TABLE_TRANSACTIONAL_PROPERTIES);
+        params.put("EXTERNAL", "TRUE");
+        params.put(EXTERNAL_TABLE_PURGE, "TRUE");
+        params.put("TRANSLATED_TO_EXTERNAL", "TRUE");
+        newTable.setParameters(params);
+        LOG.info("Modified table params are:" + params.toString());
+
+        if (!table.isSetSd() || table.getSd().getLocation() == null) {
+          try {
+            Path newPath = hmsHandler.getWh().getDefaultTablePath(db, 
table.getTableName(), true);

Review comment:
       one of its tests are broken:
   
http://ci.hive.apache.org/job/hive-precommit/job/PR-2442/11/testReport/org.apache.hadoop.hive.cli/TestNegativeLlapLocalCliDriver/Testing___split_18___PostProcess___testCliDriver_translated_external_rename_/

##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java
##########
@@ -632,37 +631,29 @@ public Table transformCreateTable(Table table, 
List<String> processorCapabilitie
       throw new MetaException("Database " + dbName + " for table " + 
table.getTableName() + " could not be found");
     }
 
-    if (TableType.MANAGED_TABLE.name().equals(tableType)) {
+      if (TableType.MANAGED_TABLE.name().equals(tableType)) {
       LOG.debug("Table is a MANAGED_TABLE");
       txnal = params.get(TABLE_IS_TRANSACTIONAL);
       txn_properties = params.get(TABLE_TRANSACTIONAL_PROPERTIES);
-      boolean ctas = Boolean.valueOf(params.getOrDefault(TABLE_IS_CTAS, 
"false"));
       isInsertAcid = (txn_properties != null && 
txn_properties.equalsIgnoreCase("insert_only"));
       if ((txnal == null || txnal.equalsIgnoreCase("FALSE")) && !isInsertAcid) 
{ // non-ACID MANAGED TABLE
-        if (ctas) {
-          LOG.info("Not Converting CTAS table " + newTable.getTableName() + " 
to EXTERNAL tableType for " + processorId);
-        } else {
-          LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL 
tableType for " + processorId);
-          newTable.setTableType(TableType.EXTERNAL_TABLE.toString());
-          params.remove(TABLE_IS_TRANSACTIONAL);
-          params.remove(TABLE_TRANSACTIONAL_PROPERTIES);
-          params.put("EXTERNAL", "TRUE");
-          params.put(EXTERNAL_TABLE_PURGE, "TRUE");
-          params.put("TRANSLATED_TO_EXTERNAL", "TRUE");
-          newTable.setParameters(params);
-          LOG.info("Modified table params are:" + params.toString());
-
-          if (getLocation(table) == null) {
-            try {
-              Path location = getTranslatedToExternalTableDefaultLocation(db, 
newTable);
-              newTable.getSd().setLocation(location.toString());
-            } catch (Exception e) {
-              throw new MetaException("Exception determining external table 
location:" + e.getMessage());
-            }
-          } else {
-            // table with explicitly set location
-            // has "translated" properties and will be removed on drop
-            // should we check tbl directory existence?
+        LOG.info("Converting " + newTable.getTableName() + " to EXTERNAL 
tableType for " + processorId);
+        newTable.setTableType(TableType.EXTERNAL_TABLE.toString());
+        params.remove(TABLE_IS_TRANSACTIONAL);
+        params.remove(TABLE_TRANSACTIONAL_PROPERTIES);
+        params.put("EXTERNAL", "TRUE");
+        params.put(EXTERNAL_TABLE_PURGE, "TRUE");
+        params.put("TRANSLATED_TO_EXTERNAL", "TRUE");
+        newTable.setParameters(params);
+        LOG.info("Modified table params are:" + params.toString());
+
+        if (!table.isSetSd() || table.getSd().getLocation() == null) {
+          try {
+            Path newPath = hmsHandler.getWh().getDefaultTablePath(db, 
table.getTableName(), true);

Review comment:
       this seem to silently removing HIVE-24920 changes

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
##########
@@ -472,6 +474,32 @@ private void setLoadFileLocation(
       loc = cmv.getLocation();
     }
     Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new 
Path(loc);
+    boolean isExternal = false;
+    boolean isAcid = false;
+    if (pCtx.getQueryProperties().isCTAS()) {
+      isExternal = pCtx.getCreateTable().isExternal();
+      isAcid = pCtx.getCreateTable().getTblProps().getOrDefault(
+              hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, 
"false").equalsIgnoreCase("true") ||
+              
pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES);
+      if ((HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) {

Review comment:
       I don't understand all these conditionals; shouldn't we just run it in 
all cases?
   the most odd is that we will run the inner block when 
`CREATE_TABLE_AS_EXTERNAL`...
   doesn't the transformer will also interrogate acid properties on the 
table/etc?
   
   nit: there seems to be some unneccessary parenhesis in this conditional
   
   

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
##########
@@ -472,6 +474,32 @@ private void setLoadFileLocation(
       loc = cmv.getLocation();
     }
     Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new 
Path(loc);
+    boolean isExternal = false;
+    boolean isAcid = false;
+    if (pCtx.getQueryProperties().isCTAS()) {
+      isExternal = pCtx.getCreateTable().isExternal();
+      isAcid = pCtx.getCreateTable().getTblProps().getOrDefault(
+              hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, 
"false").equalsIgnoreCase("true") ||
+              
pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES);
+      if ((HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) {
+        CreateTableDesc ctd = pCtx.getCreateTable();
+        ctd.getTblProps().put(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, 
"false"); // create as external table

Review comment:
       I simply running the transformer and not care about other things may 
make things more straightforward

##########
File path: ql/src/test/queries/clientpositive/ctas.q
##########
@@ -73,3 +73,13 @@ select k, value from acid_ctas_part;
 
 explain formatted
 select k, value from acid_ctas_part;
+
+-- CTAS with external legacy config

Review comment:
       afaik the translator is disabled in the qtests by default 
   you could look at `translated_external_rename2.q` for an example how to 
enable it
   and could you put it into a new test? its easier to debug/fix/etc if it 
breaks 

##########
File path: ql/src/test/results/clientpositive/llap/ctas.q.out
##########
@@ -160,7 +163,7 @@ STAGE PLANS:
     Move Operator
       files:
           hdfs directory: true
-#### A masked pattern was here ####
+          destination: hdfs://### HDFS PATH ###

Review comment:
       unfortunately we don't see the real path...the masking logic is a  bit 
overzealous in replacing stuff; but the fact that the masking logic have 
changed might mean that there must be some difference in this case - you should 
look into the other files which are not masked and see what have changed; I 
think there will be some important differences there

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
##########
@@ -472,6 +474,32 @@ private void setLoadFileLocation(
       loc = cmv.getLocation();
     }
     Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new 
Path(loc);
+    boolean isExternal = false;
+    boolean isAcid = false;
+    if (pCtx.getQueryProperties().isCTAS()) {
+      isExternal = pCtx.getCreateTable().isExternal();
+      isAcid = pCtx.getCreateTable().getTblProps().getOrDefault(
+              hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, 
"false").equalsIgnoreCase("true") ||
+              
pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES);
+      if ((HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) {
+        CreateTableDesc ctd = pCtx.getCreateTable();
+        ctd.getTblProps().put(hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, 
"false"); // create as external table
+        try {
+          Table table = ctd.toTable(conf);
+          table = db.getCTASQueryDryrun(table.getTTable());
+          org.apache.hadoop.hive.metastore.api.Table tTable = 
table.getTTable();
+          if (tTable.getSd() != null  && tTable.getSd().getLocation() != null) 
{
+            location = new Path(tTable.getSd().getLocation());
+            ctd.setLocation(location.toString());
+          }
+          
ctd.setExternal(TableType.EXTERNAL_TABLE.toString().equals(tTable.getTableType()));
+          ctd.setTblProps(tTable.getParameters());

Review comment:
       you could probably move these things into  a method like 
`ctd.fromTable(table)`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 642935)
    Time Spent: 1.5h  (was: 1h 20m)

> CTAS hive.create.as.external.legacy tries to place data files in managed WH 
> path
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-25303
>                 URL: https://issues.apache.org/jira/browse/HIVE-25303
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Standalone Metastore
>            Reporter: Sai Hemanth Gantasala
>            Assignee: Sai Hemanth Gantasala
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Under legacy table creation mode (hive.create.as.external.legacy=true), when 
> a database has been created in a specific LOCATION, in a session where that 
> database is Used, tables are created using the following command:
> {code:java}
> CREATE TABLE <tablename> AS SELECT <select statement>{code}
> should inherit the HDFS path from the database's location. Instead, Hive is 
> trying to write the table data into 
> /warehouse/tablespace/managed/hive/<database_directory_name>/<table_name>
> +Design+: 
> In the CTAS query, first data is written in the target directory (which 
> happens in HS2) and then the table is created(This happens in HMS). So here 
> two decisions are being made i) target directory location ii) how the table 
> should be created (table type, sd e.t.c).
> When HS2 needs a target location that needs to be set, it'll make create 
> table dry run call to HMS (where table translation happens) and i) and ii) 
> decisions are made within HMS and returns table object. Then HS2 will use 
> this location set by HMS for placing the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to