shizhengchao created FLINK-25703: ------------------------------------ Summary: In Batch, I think .stagingPath should be created when the task is running to prevent permission issues caused by inconsistency of hadoop usernames Key: FLINK-25703 URL: https://issues.apache.org/jira/browse/FLINK-25703 Project: Flink Issue Type: Bug Components: Table SQL / Runtime Affects Versions: 1.14.3 Reporter: shizhengchao
If the HADOOP_USER_NAME set on the client side is user_a, but when the task is running in the TaskManager is user_b(this is achievable), then the Batch task will cause permission problems, as follows: {code:java} //代码占位符 Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=user_b, access=WRITE, inode="/where/to/whatever/.staging__1642575264185":user_a:supergroup:drwxr-xr-x{code} because the ./staging__1642575264185 is created by user_a。 So we should move the code that creates the .staging directory into the FileSystemOutputFormat#open method : {code:java} //代码占位符 @Override public void open(int taskNumber, int numTasks) throws IOException { try { // check temppath and create it checkOrCreateTmpPath(); PartitionTempFileManager fileManager = new PartitionTempFileManager( fsFactory, tmpPath, taskNumber, CHECKPOINT_ID, outputFileConfig); PartitionWriter.Context<T> context = new PartitionWriter.Context<>(parameters, formatFactory); writer = PartitionWriterFactory.<T>get( partitionColumns.length - staticPartitions.size() > 0, dynamicGrouped, staticPartitions) .create(context, fileManager, computer); } catch (Exception e) { throw new TableException("Exception in open", e); } } private void checkOrCreateTmpPath() throws Exception { FileSystem fs = tmpPath.getFileSystem(); Preconditions.checkState(fs.exists(tmpPath) || fs.mkdirs(tmpPath), "Failed to create staging dir " + tmpPath); } {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)