[ 
https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834211#comment-17834211
 ] 

Stamatis Zampetakis commented on HIVE-28019:
--------------------------------------------

The Jira summary and description claim that something is broken/buggy in Hive. 
However, after looking at the proposed fix and the changes caused by the PR, I 
get the impression that "broken/buggy" can be a bit subjective. 

The current proposal claims that if the query is an EXPLAIN statement then it 
should be tagged as {{HiveOperation.EXPLAIN}}. Without much context this seems 
like a natural fit, but we should take into account that a large number of Hive 
operations can be used in conjunction with EXPLAIN. 

I did some post-processing on the .q.out changes introduced by PR#5022 to gauge 
the impact on existing applications/users.
{noformat}
git diff --word-diff -U0 d73aef569f72d2bd8ecbf8b708e3d55d59c6615d..HEAD | grep "HOOK: type" | sed 's/.\+HOOK: type: //' | sort | uniq -c
      2 [-ABORT TRANSACTIONS-]{+EXPLAIN+}
      2 [-ALTERDATABASE-]{+EXPLAIN+}
      4 [-ALTERDATABASE_LOCATION-]{+EXPLAIN+}
      2 [-ALTERDATABASE_OWNER-]{+EXPLAIN+}
      6 [-ALTER MAPPING-]{+EXPLAIN+}
    142 [-ALTER_MATERIALIZED_VIEW_REBUILD-]{+EXPLAIN+}
     12 [-ALTER_MATERIALIZED_VIEW_REWRITE-]{+EXPLAIN+}
      8 [-ALTER_PARTITION_MERGE-]{+EXPLAIN+}
     10 [-ALTER POOL-]{+EXPLAIN+}
     26 [-ALTER RESOURCEPLAN-]{+EXPLAIN+}
      2 [-ALTERTABLE_ADDCOLS-]{+EXPLAIN+}
      2 [-ALTERTABLE_ADDCONSTRAINT-]{+EXPLAIN+}
     12 [-ALTERTABLE_ADDPARTS-]{+EXPLAIN+}
      2 [-ALTERTABLE_ARCHIVE-]{+EXPLAIN+}
      2 [-ALTERTABLE_BUCKETNUM-]{+EXPLAIN+}
      6 [-ALTERTABLE_CLUSTER_SORT-]{+EXPLAIN+}
     20 [-ALTERTABLE_COMPACT-]{+EXPLAIN+}
     20 [-ALTERTABLE_CONVERT-]{+EXPLAIN+}
     12 [-ALTERTABLE_CREATEBRANCH-]{+EXPLAIN+}
      4 [-ALTERTABLE_CREATETAG-]{+EXPLAIN+}
      4 [-ALTERTABLE_DROPBRANCH-]{+EXPLAIN+}
      2 [-ALTERTABLE_DROPCONSTRAINT-]{+EXPLAIN+}
     12 [-ALTERTABLE_DROPPARTS-]{+EXPLAIN+}
      4 [-ALTERTABLE_DROPTAG-]{+EXPLAIN+}
     10 [-ALTERTABLE_EXCHANGEPARTITION-]{+EXPLAIN+}
      8 [-ALTERTABLE_EXECUTE-]{+EXPLAIN+}
      2 [-ALTERTABLE_FILEFORMAT-]{+EXPLAIN+}
      2 [-ALTERTABLE_LOCATION-]{+EXPLAIN+}
      2 [-ALTER_TABLE_MERGE-]{+EXPLAIN+}
      2 [-ALTERTABLE_OWNER-]{+EXPLAIN+}
      4 [-ALTERTABLE_PARTCOLTYPE-]{+EXPLAIN+}
      4 [-ALTERTABLE_PROPERTIES-]{+EXPLAIN+}
      8 [-ALTERTABLE_RENAMECOL-]{+EXPLAIN+}
     14 [-ALTERTABLE_RENAME-]{+EXPLAIN+}
      4 [-ALTERTABLE_RENAMEPART-]{+EXPLAIN+}
      2 [-ALTERTABLE_REPLACECOLS-]{+EXPLAIN+}
      4 [-ALTERTABLE_SERDEPROPERTIES-]{+EXPLAIN+}
      2 [-ALTERTABLE_SERIALIZER-]{+EXPLAIN+}
      4 [-ALTERTABLE_SKEWED-]{+EXPLAIN+}
      4 [-ALTERTABLE_TOUCH-]{+EXPLAIN+}
      2 [-ALTERTABLE_UNARCHIVE-]{+EXPLAIN+}
      4 [-ALTERTABLE_UPDATECOLUMNS-]{+EXPLAIN+}
      2 [-ALTERTBLPART_SKEWED_LOCATION-]{+EXPLAIN+}
      6 [-ALTER TRIGGER-]{+EXPLAIN+}
     90 [-ANALYZE_TABLE-]{+EXPLAIN+}
      8 [-CREATEDATABASE-]{+EXPLAIN+}
     18 [-CREATEFUNCTION-]{+EXPLAIN+}
      4 [-CREATEMACRO-]{+EXPLAIN+}
      8 [-CREATE MAPPING-]{+EXPLAIN+}
     34 [-CREATE_MATERIALIZED_VIEW-]{+EXPLAIN+}
      6 [-CREATE POOL-]{+EXPLAIN+}
      6 [-CREATE RESOURCEPLAN-]{+EXPLAIN+}
      4 [-CREATEROLE-]{+EXPLAIN+}
    136 [-CREATETABLE_AS_SELECT-]{+EXPLAIN+}
     60 [-CREATETABLE-]{+EXPLAIN+}
      6 [-CREATE TRIGGER-]{+EXPLAIN+}
     30 [-CREATEVIEW-]{+EXPLAIN+}
     10 [-DESCDATABASE-]{+EXPLAIN+}
      4 [-DESCFUNCTION-]{+EXPLAIN+}
      8 [-DESCTABLE-]{+EXPLAIN+}
      2 [-DROPDATABASE-]{+EXPLAIN+}
      6 [-DROPFUNCTION-]{+EXPLAIN+}
      4 [-DROPMACRO-]{+EXPLAIN+}
      8 [-DROP MAPPING-]{+EXPLAIN+}
      6 [-DROP POOL-]{+EXPLAIN+}
      6 [-DROP RESOURCEPLAN-]{+EXPLAIN+}
      4 [-DROPROLE-]{+EXPLAIN+}
      8 [-DROPTABLE-]{+EXPLAIN+}
      6 [-DROP TRIGGER-]{+EXPLAIN+}
      4 [-DROPVIEW-]{+EXPLAIN+}
     28 [-EXECUTE QUERY-]{+EXPLAIN+}
      8 [-GRANT_PRIVILEGE-]{+EXPLAIN+}
      4 [-GRANT_ROLE-]{+EXPLAIN+}
      4 [-KILL QUERY-]{+EXPLAIN+}
     20 [-LOAD-]{+EXPLAIN+}
     22 [-MSCK-]{+EXPLAIN+}
  16626 [-QUERY-]{+EXPLAIN+}
     40 [-QUERY-]{+LOAD+}
      4 [-RELOADFUNCTION-]{+EXPLAIN+}
      8 [-REVOKE_PRIVILEGE-]{+EXPLAIN+}
     12 [-SHOWCOLUMNS-]{+EXPLAIN+}
      2 [-SHOW COMPACTIONS-]{+EXPLAIN+}
      2 [-SHOWCONF-]{+EXPLAIN+}
      2 [-SHOW_CREATEDATABASE-]{+EXPLAIN+}
      2 [-SHOWDATABASES-]{+EXPLAIN+}
      6 [-SHOWFUNCTIONS-]{+EXPLAIN+}
      8 [-SHOW_GRANT-]{+EXPLAIN+}
      4 [-SHOWLOCKS-]{+EXPLAIN+}
     20 [-SHOWMATERIALIZEDVIEWS-]{+EXPLAIN+}
     28 [-SHOWPARTITIONS-]{+EXPLAIN+}
     12 [-SHOW RESOURCEPLAN-]{+EXPLAIN+}
      4 [-SHOW_ROLE_GRANT-]{+EXPLAIN+}
     24 [-SHOWTABLES-]{+EXPLAIN+}
      2 [-SHOW_TABLESTATUS-]{+EXPLAIN+}
      2 [-SHOW TRANSACTIONS-]{+EXPLAIN+}
     10 [-SHOWVIEWS-]{+EXPLAIN+}
     14 [-SWITCHDATABASE-]{+EXPLAIN+}
     30 [-TRUNCATETABLE-]{+EXPLAIN+}
{noformat}

There are lots of SQL queries/operations that are now classified as EXPLAIN 
queries. For some people this could be considered an improvement, but for 
others it is a loss of information, since we no longer have the original type.

Moreover, the {{HiveOperation}} class was originally introduced for 
authorization purposes (HIVE-78), and to this day that remains its first and 
most prominent use in Hive's production code. Since we don't want to change the 
authorization implications of the operations outlined above (even when we are 
running an EXPLAIN), we shouldn't change the query type altogether.

Based on the description in PR#5022, I understand that some users of the 
{{HiveProtoLoggingHook}} would like to know whether a query is an EXPLAIN or 
not. If that's the main motivation behind this ticket, then maybe we should 
focus on that aspect. Instead of "dropping" existing information and changing 
the query type, we could opt to enrich the proto output, or whatever else is 
needed, with additional context about whether the SQL statement is an EXPLAIN.
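
To make the "enrich instead of replace" idea concrete, here is a rough, 
standalone sketch. Everything in it is hypothetical: {{QueryEventSketch}}, 
{{QueryEvent}}, {{isExplain}} and {{toEvent}} are illustrative names and are 
not part of Hive or of the {{HiveProtoLoggingHook}} schema. The textual 
EXPLAIN check is also only a placeholder; an actual change would presumably 
take this information from the parsed statement.

```java
/**
 * Hypothetical sketch: keep the original operation name and record
 * "is this an EXPLAIN?" as a separate piece of context.
 */
final class QueryEventSketch {

    /** Illustrative stand-in for a logged event; not Hive's actual proto message. */
    static final class QueryEvent {
        final String operation; // original operation name, e.g. "CREATETABLE"
        final boolean explain;  // true when the statement starts with EXPLAIN

        QueryEvent(String operation, boolean explain) {
            this.operation = operation;
            this.explain = explain;
        }

        @Override
        public String toString() {
            return operation + " (explain=" + explain + ")";
        }
    }

    /** Naive check on the SQL text, used here only to keep the sketch self-contained. */
    static boolean isExplain(String sql) {
        // (?i) ignores case, (?s) lets '.' span newlines, \b stops "EXPLAINX" from matching.
        return sql.matches("(?is)\\s*EXPLAIN\\b.*");
    }

    /** Builds the event without overwriting the original operation name. */
    static QueryEvent toEvent(String originalOperation, String sql) {
        return new QueryEvent(originalOperation, isExplain(sql));
    }

    public static void main(String[] args) {
        System.out.println(toEvent("CREATETABLE", "EXPLAIN CREATE TABLE t (a int)"));
        System.out.println(toEvent("QUERY", "SELECT 1"));
    }
}
```

With this shape, consumers interested in EXPLAIN can filter on the extra 
flag, while authorization and existing users still see the original operation.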

> Fix query type information in proto files for load and explain queries
> ----------------------------------------------------------------------
>
>                 Key: HIVE-28019
>                 URL: https://issues.apache.org/jira/browse/HIVE-28019
>             Project: Hive
>          Issue Type: Task
>          Components: HiveServer2
>            Reporter: Ramesh Kumar Thangarajan
>            Assignee: Ramesh Kumar Thangarajan
>            Priority: Major
>              Labels: pull-request-available
>
> Certain query types like LOAD, export, import and explain queries did not 
> produce the right Hive operation type



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
