[ https://issues.apache.org/jira/browse/HIVE-26789?focusedWorklogId=829875&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-829875 ]

ASF GitHub Bot logged work on HIVE-26789:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Nov/22 20:14
            Start Date: 29/Nov/22 20:14
    Worklog Time Spent: 10m
      Work Description: cnauroth commented on code in PR #3813:
URL: https://github.com/apache/hive/pull/3813#discussion_r1035232269


##########
service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java:
##########
@@ -328,7 +328,8 @@ public Object run() throws HiveSQLException {
               if (!embedded) {
                 LogUtils.registerLoggingContext(queryState.getConf());
               }
-              ShimLoader.getHadoopShims().setHadoopQueryContext(queryState.getQueryId());
+              ShimLoader.getHadoopShims()
+                  .setHadoopQueryContext(queryState.getQueryId() + " User:" + parentSessionState.getUserName());

Review Comment:
   For consistency with the other call sites setting query context, please add a space after the colon:
   
   ```
   ... + " User: " + ...
   ```
   
   (However, also see my other comment about whether or not we should use spaces. Whatever is decided for the format should be consistent at all call sites.)


##########
service/src/java/org/apache/hive/service/cli/operation/Operation.java:
##########
@@ -237,7 +238,9 @@ protected void createOperationLog() {
    * Set up some preconditions, or configurations.
    */
   protected void beforeRun() {
-    ShimLoader.getHadoopShims().setHadoopQueryContext(queryState.getQueryId());
+    CallerContext.setCurrent(new CallerContext.Builder("Check").build());

Review Comment:
   Should this line be removed? Unless I'm mistaken, the call to `setHadoopQueryContext` on the next line will overwrite the value set by this line.

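
The overwrite described in the Operation.java comment can be sketched without Hadoop on the classpath. This is a minimal illustration only: `ContextHolder` is a hypothetical thread-local stand-in, not Hadoop's `CallerContext`, and the argument passed in the second call is an assumption because the diff hunk above is cut off before that line. It assumes, as the comment suggests, that setting the context replaces the current per-thread value rather than merging with it.

```java
// Hypothetical stand-in for a per-thread caller context; NOT Hadoop's class.
final class ContextHolder {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // Replaces whatever value was stored before; nothing is merged.
    static void setCurrent(String context) {
        CURRENT.set(context);
    }

    static String getCurrent() {
        return CURRENT.get();
    }
}

final class OverwriteSketch {
    // Mirrors the order of the two calls discussed in the review comment.
    static String beforeRun(String queryId, String userName) {
        ContextHolder.setCurrent("Check");                          // first write
        ContextHolder.setCurrent(queryId + " User: " + userName);   // second write replaces it
        return ContextHolder.getCurrent();
    }

    public static void main(String[] args) {
        // Only the value from the second call is visible afterwards.
        System.out.println(beforeRun("hive_query_1", "alice"));
    }
}
```

Under these assumptions the value "Check" is never observable after `beforeRun` returns, which is why the first call looks redundant.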

##########
cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java:
##########
@@ -250,7 +250,8 @@ CommandProcessorResponse processLocalCmd(String cmd, CommandProcessor proc, CliS
       }
 
       // Set HDFS CallerContext to queryId and reset back to sessionId after the query is done
-      ShimLoader.getHadoopShims().setHadoopQueryContext(qp.getQueryState().getQueryId());
+      ShimLoader.getHadoopShims()
+          .setHadoopQueryContext(qp.getQueryState().getQueryId() + " User: " + ss.getUserName());

Review Comment:
   I wonder if we should avoid embedding spaces in the format. Prior usage of caller context that I've seen uses an underscore-delimited format. The Hadoop compatibility guidelines state that the HDFS audit log format should be considered public and stable:
   
   https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
   
   Embedding spaces could break existing scripts that perform positional parsing using utilities like `cut` and `awk`.



Issue Time Tracking
-------------------

    Worklog Id:     (was: 829875)
    Time Spent: 0.5h  (was: 20m)

> Add UserName in CallerContext for queries
> -----------------------------------------
>
>                 Key: HIVE-26789
>                 URL: https://issues.apache.org/jira/browse/HIVE-26789
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When impersonation is disabled, the HDFS audit log tracks only the Hive user.
> The actual user can be passed as part of the CallerContext so that it is
> logged as well for better tracking.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
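
The positional-parsing concern raised in the CliDriver.java review comment can be illustrated with a small sketch. Everything here is an assumption made for illustration: the log line is a simplified, hypothetical audit-style record (not the exact HDFS audit log layout), and the underscore form `queryId_User_userName` is just one possible delimiter choice, not a format agreed in the PR. The split on whitespace runs approximates the default field splitting of `awk`.

```java
final class AuditFieldSketch {
    // Splits on runs of whitespace, roughly like awk's default field splitting.
    static String[] fields(String line) {
        return line.trim().split("\\s+");
    }

    // Hypothetical, simplified audit-style line; not the real HDFS audit format.
    static String auditLine(String callerContext) {
        return "cmd=open src=/warehouse/t1 callerContext=" + callerContext + " proto=rpc";
    }

    public static void main(String[] args) {
        String underscored = auditLine("hive_q1_User_alice");
        String spaced = auditLine("hive_q1 User: alice");

        // With underscores the record keeps four fields and the last one is still proto=rpc.
        System.out.println(fields(underscored).length + " fields, 4th = " + fields(underscored)[3]);

        // With spaces the context is split apart and later fields shift position.
        System.out.println(fields(spaced).length + " fields, 4th = " + fields(spaced)[3]);
    }
}
```

In this sketch a script that reads the fourth whitespace-separated field would see `proto=rpc` for the underscore form but `User:` for the spaced form, which is the kind of breakage the comment warns about.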