Xun REN created HADOOP-17359:
--------------------------------

             Summary: [Hadoop-Tools] S3A MultiObjectDeleteException after uploading a file
                 Key: HADOOP-17359
                 URL: https://issues.apache.org/jira/browse/HADOOP-17359
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.10.0
            Reporter: Xun REN


Hello,

I am using org.apache.hadoop.fs.s3a.S3AFileSystem as the implementation for S3-related operations.

When I upload a file to a path, the operation reports an error:
{code:java}
20/11/05 11:49:13 ERROR s3a.S3AFileSystem: Partial failure of delete, 1 errors
20/11/05 11:49:13 ERROR s3a.S3AFileSystem: Partial failure of delete, 1 errors
com.amazonaws.services.s3.model.MultiObjectDeleteException: One or more objects could not be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: 767BEC034D0B5B8A; S3 Extended Request ID: JImfJY9hCl/QvninqT9aO+jrkmyRpRcceAg7t1lO936RfOg7izIom76RtpH+5rLqvmBFRx/++g8=; Proxy: null), S3 Extended Request ID: JImfJY9hCl/QvninqT9aO+jrkmyRpRcceAg7t1lO936RfOg7izIom76RtpH+5rLqvmBFRx/++g8=
    at com.amazonaws.services.s3.AmazonS3Client.deleteObjects(AmazonS3Client.java:2287)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:1137)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:1389)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteUnnecessaryFakeDirectories(S3AFileSystem.java:2304)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.finishedWrite(S3AFileSystem.java:2270)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$WriteOperationHelper.writeSuccessful(S3AFileSystem.java:2768)
    at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:371)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:74)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:108)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:69)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:128)
    at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:488)
    at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:410)
    at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:342)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:277)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:262)
    at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:327)
    at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:299)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:257)
    at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:281)
    at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:265)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:228)
    at org.apache.hadoop.fs.shell.CopyCommands$Put.processArguments(CopyCommands.java:285)
    at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:317)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:380)
20/11/05 11:49:13 ERROR s3a.S3AFileSystem: bv/: "AccessDenied" - Access Denied
{code}
The problem is that Hadoop creates fake directory objects to map S3 prefixes to directories, and it cleans them up after the operation. The cleanup walks from the parent folder all the way up to the root folder.

If the user has not been given the corresponding delete permission on one of those paths, it runs into this problem:

[https://github.com/apache/hadoop/blob/rel/release-2.10.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2296-L2301]
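
To make that cleanup concrete, here is a minimal standalone sketch of which marker keys it targets (the class and helper below are illustrative, not the actual S3AFileSystem implementation):
{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.Path;

// Standalone sketch of the cleanup described above: after a write, S3A collects
// a "dir/" marker key for every ancestor of the destination and deletes them in
// one bulk DeleteObjects request. This class only prints the keys; the helper
// name and key construction are simplifications, not the real code.
public class FakeDirCleanupSketch {

  // Collect the fake-directory marker keys from the given path up to the root.
  static List<String> parentMarkerKeys(Path path) {
    List<String> keys = new ArrayList<>();
    while (path != null && !path.isRoot()) {
      keys.add(path.toUri().getPath().substring(1) + "/"); // drop leading '/'
      path = path.getParent();
    }
    return keys;
  }

  public static void main(String[] args) {
    // For a file written under a/b/c, the bulk delete covers "a/b/c/", "a/b/"
    // and "a/" -- and the delete of "a/" is what gets AccessDenied when the
    // user's permissions only start at a/b/.
    parentMarkerKeys(new Path("/a/b/c")).forEach(System.out::println);
  }
}
{code}
In the log above, the denied marker is {{bv/}}: a single denied key in that bulk request is enough to surface as a MultiObjectDeleteException.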

 

During the upload, I don't see any "fake" directories actually being created. Why should we clean them up if they were never really created?

The same applies to other operations such as rename or mkdir, where the "deleteUnnecessaryFakeDirectories" method is also called.

Maybe the solution is to check the delete permission before calling the deleteObjects method.
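
To sketch that idea: S3 has no dedicated "can I delete?" call, but deleting a key that does not exist succeeds as a no-op when permitted and is rejected with AccessDenied when it is not, so a probe key could approximate such a check. The class, bucket and prefixes below are purely illustrative and not part of Hadoop:
{code:java}
import java.util.UUID;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.AmazonS3Exception;

// Illustration only -- not Hadoop code. Probes whether the current credentials
// may delete objects under a prefix by deleting a key that does not exist:
// S3 authorizes the request before checking existence, so a denied policy
// returns 403 while a permitted delete of a missing key is a harmless no-op
// (note: on a versioned bucket this would leave a delete marker behind).
public class DeletePermissionProbe {

  static boolean canDeleteUnder(AmazonS3 s3, String bucket, String prefix) {
    String probeKey = prefix + ".s3a-delete-probe-" + UUID.randomUUID();
    try {
      s3.deleteObject(bucket, probeKey);
      return true;
    } catch (AmazonS3Exception e) {
      if (e.getStatusCode() == 403) {
        return false;          // AccessDenied: the cleanup should skip this prefix
      }
      throw e;                 // anything else is a real error
    }
  }

  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    // With the permissions from the reproduction steps below, "a/b/" should
    // print true and "a/" should print false.
    System.out.println(canDeleteUnder(s3, "my_bucket", "a/b/"));
    System.out.println(canDeleteUnder(s3, "my_bucket", "a/"));
  }
}
{code}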

 

To reproduce the problem (a Java equivalent of the steps follows):
 # In a bucket named my_bucket, we have the path s3://my_bucket/a/b/c.
 # The corresponding user only has permission on the path b and the sub-paths inside it.
 # We run the command "hdfs dfs -mkdir s3://my_bucket/a/b/c/d".
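
The same reproduction as a minimal Java program against the FileSystem API (it assumes hadoop-aws on the classpath and the restricted user's credentials in the usual fs.s3a.* settings or the environment; the s3a:// scheme is spelled out here, while my cluster maps s3:// to S3AFileSystem):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Same reproduction as the shell command above, through the FileSystem API.
public class FakeDirDeleteRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path newDir = new Path("s3a://my_bucket/a/b/c/d");
    FileSystem fs = newDir.getFileSystem(conf);
    try {
      // The mkdir itself succeeds, but the post-operation cleanup of the fake
      // parent-directory markers then tries to delete "a/" and hits the
      // MultiObjectDeleteException / AccessDenied shown above.
      fs.mkdirs(newDir);
    } finally {
      fs.close();
    }
  }
}
{code}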


