[ 
https://issues.apache.org/jira/browse/HIVE-26789?focusedWorklogId=829875&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-829875
 ]

ASF GitHub Bot logged work on HIVE-26789:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Nov/22 20:14
            Start Date: 29/Nov/22 20:14
    Worklog Time Spent: 10m 
      Work Description: cnauroth commented on code in PR #3813:
URL: https://github.com/apache/hive/pull/3813#discussion_r1035232269


##########
service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java:
##########
@@ -328,7 +328,8 @@ public Object run() throws HiveSQLException {
           if (!embedded) {
             LogUtils.registerLoggingContext(queryState.getConf());
           }
-          ShimLoader.getHadoopShims().setHadoopQueryContext(queryState.getQueryId());
+          ShimLoader.getHadoopShims()
+              .setHadoopQueryContext(queryState.getQueryId() + " User:" + parentSessionState.getUserName());

Review Comment:
   For consistency with the other call sites setting query context, please add 
a space after the colon:
   
   ```
   ... + " User: " + ...
   ```
   
   (However, also see my other comment about whether or not we should use 
spaces. Whatever is decided for the format should be consistent at all call 
sites.)
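   
   One way to keep the format identical at every call site is to build the 
   string in a single place. The sketch below is illustrative only: 
   `QueryContextFormat`, `withUser`, and `USER_SEPARATOR` are hypothetical 
   names that are not part of this PR or of Hive, and the separator shown is 
   simply the one suggested above, pending the decision on spaces.
   
   ```java
   /** Hypothetical helper (an assumption for illustration, not Hive code). */
   public final class QueryContextFormat {
       // Assumed separator; the final format is still under discussion in this review.
       static final String USER_SEPARATOR = " User: ";

       private QueryContextFormat() {
       }

       /** Builds the caller-context string from a query id and a user name. */
       static String withUser(String queryId, String userName) {
           return queryId + USER_SEPARATOR + userName;
       }

       public static void main(String[] args) {
           // Example values are made up.
           System.out.println(withUser("example_query_id", "alice"));
       }
   }
   ```
   
   Each call site would then pass `QueryContextFormat.withUser(...)` to 
   `setHadoopQueryContext`, so a later change of separator touches one line.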



##########
service/src/java/org/apache/hive/service/cli/operation/Operation.java:
##########
@@ -237,7 +238,9 @@ protected void createOperationLog() {
    * Set up some preconditions, or configurations.
    */
   protected void beforeRun() {
-    ShimLoader.getHadoopShims().setHadoopQueryContext(queryState.getQueryId());
+    CallerContext.setCurrent(new CallerContext.Builder("Check").build());

Review Comment:
   Should this line be removed? Unless I'm mistaken, the call to 
`setHadoopQueryContext` on the next line will overwrite the value set by this 
line.



##########
cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java:
##########
@@ -250,7 +250,8 @@ CommandProcessorResponse processLocalCmd(String cmd, CommandProcessor proc, CliS
         }
 
         // Set HDFS CallerContext to queryId and reset back to sessionId after the query is done
-        ShimLoader.getHadoopShims().setHadoopQueryContext(qp.getQueryState().getQueryId());
+        ShimLoader.getHadoopShims()
+            .setHadoopQueryContext(qp.getQueryState().getQueryId() + " User: " + ss.getUserName());

Review Comment:
   I wonder if we should avoid embedding spaces in the format. Prior usage of 
caller context that I've seen uses an underscore-delimited format. The Hadoop 
compatibility guidelines state that the HDFS audit log format should be 
considered public and stable:
   
   
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
   
   Embedding spaces could break existing scripts that perform positional 
parsing using utilities like `cut` and `awk`.
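   
   The positional-parsing concern can be sketched with a small whitespace 
   split, roughly what `awk` does with its default field separator. The log 
   line here is a simplified stand-in, not the real HDFS audit log layout, and 
   `AuditFieldSplitDemo`, `line`, and `fieldCount` are hypothetical names 
   introduced only for this illustration.
   
   ```java
   public final class AuditFieldSplitDemo {
       private AuditFieldSplitDemo() {
       }

       /** Builds a simplified, made-up audit-style line ending in the caller context. */
       static String line(String callerContext) {
           return "ugi=hive cmd=open callerContext=" + callerContext;
       }

       /** Counts whitespace-separated fields, as a positional parser would see them. */
       static int fieldCount(String line) {
           return line.trim().split("\\s+").length;
       }

       public static void main(String[] args) {
           // Underscore-delimited context: the caller context stays in one field.
           System.out.println(fieldCount(line("q1_User:alice")));
           // Context with embedded spaces: the caller context spills into extra fields.
           System.out.println(fieldCount(line("q1 User: alice")));
       }
   }
   ```
   
   With the underscore form the line splits into 3 fields. With embedded 
   spaces it splits into 5, so any field after the caller context would shift 
   position for a script that indexes by column.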





Issue Time Tracking
-------------------

    Worklog Id:     (was: 829875)
    Time Spent: 0.5h  (was: 20m)

> Add UserName in CallerContext for queries
> -----------------------------------------
>
>                 Key: HIVE-26789
>                 URL: https://issues.apache.org/jira/browse/HIVE-26789
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When impersonation is disabled, the HDFS audit log records only the Hive 
> user. The actual user can be passed as part of the CallerContext so that it 
> is logged as well, for better tracking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
