Steve Loughran created HADOOP-19388:
---------------------------------------
Summary: S3A: Validate bulk delete through Iceberg HadoopFileIO
Key: HADOOP-19388
URL: https://issues.apache.org/jira/browse/HADOOP-19388
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3, test
Affects Versions: 3.4.1
Reporter: Steve Loughran
Assignee: Steve Loughran

Now that Hadoop 3.4.1 has shipped, we can link up Iceberg to it through reflection:
https://github.com/apache/iceberg/pull/10233

However, we can't put a test in there, even just to talk to the minio docker image which S3FileIO tests with, because the tests would only work with Hadoop 3.4.1+.

Proposed: add a validation test here, initially just with a JAR built from the PR.

Initially this just says "it works as expected". However, it will go on to become the regression test, "it still works", so there's no need to wait for downstream tests to be run and failures to be reported back.

We need a test suite which
* Adds a test-time dependency on the iceberg JAR with bulk delete through the HadoopFileIO class.
* Runs compliance tests: single/multi delete, complex names, directories, missing paths.
* Is parameterized on single/multi delete enabled in s3a, and on iceberg to use/not use bulk delete.
* Includes IOStats assertions to verify bulk delete was actually used.
* Mixes in some local file:// files so as to validate multiple stores with different page sizes.

I had started this within HADOOP-19385, with the iceberg jar one of the formats and the new test module to include the base contract test suite. But as the iceberg JAR is java17+, it rapidly becomes unworkable.

Instead, it will all go into s3a with a new java17 profile which will
* add iceberg jar dependency
* add a new src/test/java17 test source tree.
* contain a minimal abstract base test
* s3a implementation

Once Hadoop is java17 then it can be moved into the main branch.

Note also: until iceberg actually ships with the PR in, this cannot be merged.
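
To illustrate the "multiple stores with different page sizes" case, here is a minimal sketch of the grouping and paging which a mixed s3a:// and file:// delete has to get right. It is not the Iceberg PR's code and uses no Hadoop or Iceberg classes: the class name BulkDeletePaging, the plan() method and the page sizes passed in are all hypothetical, chosen only so the example is self-contained.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only: not a Hadoop or Iceberg class. */
public class BulkDeletePaging {

  /** Identify a store by scheme plus authority, e.g. "s3a://bucket" or "file://". */
  static String storeKey(URI uri) {
    String authority = uri.getAuthority() == null ? "" : uri.getAuthority();
    return uri.getScheme() + "://" + authority;
  }

  /**
   * Group paths by store, then split each group into pages.
   * pageSizes maps a URI scheme to an assumed page size; schemes
   * without an entry fall back to a page size of 1.
   */
  static Map<String, List<List<URI>>> plan(List<URI> paths,
      Map<String, Integer> pageSizes) {
    // Step 1: bucket every path under its store, keeping input order.
    Map<String, List<URI>> byStore = new LinkedHashMap<>();
    for (URI path : paths) {
      byStore.computeIfAbsent(storeKey(path), k -> new ArrayList<>()).add(path);
    }
    // Step 2: cut each store's list into pages of that store's size.
    Map<String, List<List<URI>>> pages = new LinkedHashMap<>();
    for (Map.Entry<String, List<URI>> entry : byStore.entrySet()) {
      List<URI> all = entry.getValue();
      int size = pageSizes.getOrDefault(all.get(0).getScheme(), 1);
      List<List<URI>> storePages = new ArrayList<>();
      for (int i = 0; i < all.size(); i += size) {
        int end = Math.min(i + size, all.size());
        storePages.add(new ArrayList<>(all.subList(i, end)));
      }
      pages.put(entry.getKey(), storePages);
    }
    return pages;
  }

  public static void main(String[] args) {
    List<URI> paths = List.of(
        URI.create("s3a://bucket/table/a.parquet"),
        URI.create("file:///tmp/table/x.parquet"),
        URI.create("s3a://bucket/table/b.parquet"),
        URI.create("s3a://bucket/table/c.parquet"));
    // Page sizes here are illustrative assumptions, not store defaults.
    Map<String, Integer> pageSizes = Map.of("s3a", 2, "file", 1);
    plan(paths, pageSizes).forEach((store, storePages) ->
        System.out.println(store + " -> " + storePages.size() + " page(s)"));
  }
}
```

A test along these lines would then assert, per store, that the number of delete requests seen in the IOStats matches the number of pages planned.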