Swapping out the iceberg-aws-bundle for the very latest AWS-provided SDK ('software.amazon.awssdk:bundle:2.25.23') produces an incompatibility from a slightly different code path:
java.lang.NoSuchMethodError: 'void org.apache.hadoop.util.SemaphoredDelegatingExecutor.<init>(java.util.concurrent.ExecutorService, int, boolean, org.apache.hadoop.fs.statistics.DurationTrackerFactory)'
    at org.apache.hadoop.fs.s3a.S3AFileSystem.executeOpen(S3AFileSystem.java:1767)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:1717)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:976)
    at org.apache.parquet.hadoop.util.HadoopInputFile.newStream(HadoopInputFile.java:69)
    at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:774)
    at org.apache.parquet.hadoop.ParquetFileReader.open(ParquetFileReader.java:658)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:53)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:44)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.$anonfun$readParquetFootersInParallel$1(ParquetFileFormat.scala:429)

________________________________
From: Oxlade, Dan <dan.oxl...@troweprice.com.INVALID>
Sent: 03 April 2024 14:33
To: Aaron Grubb <aa...@kaden.ai>; user@spark.apache.org <user@spark.apache.org>
Subject: Re: [EXTERNAL] Re: [Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

[sorry; replying all this time]

With hadoop-*-3.3.6 in place of the 3.4.0 below I get

java.lang.NoClassDefFoundError: com/amazonaws/AmazonClientException

I think that the below iceberg-aws-bundle version supplies the v2 sdk.

Dan

________________________________
From: Aaron Grubb <aa...@kaden.ai>
Sent: 03 April 2024 13:52
To: user@spark.apache.org <user@spark.apache.org>
Subject: [EXTERNAL] Re: [Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

Downgrade to hadoop-*:3.3.x, Hadoop 3.4.x is based on the AWS SDK v2 and should probably be considered as breaking for tools that build on < 3.4.0 while using AWS.
________________________________
From: Oxlade, Dan <dan.oxl...@troweprice.com.INVALID>
Sent: Wednesday, April 3, 2024 2:41:11 PM
To: user@spark.apache.org <user@spark.apache.org>
Subject: [Spark]: Spark / Iceberg / hadoop-aws compatibility matrix

Hi all,

I’ve struggled with this for quite some time.
My requirement is to read a parquet file from s3 to a Dataframe then append to an existing iceberg table.

In order to read the parquet I need the hadoop-aws dependency for s3a:// .
In order to write to iceberg I need the iceberg dependency.
Both of these dependencies have a transitive dependency on the aws SDK.
I can’t find versions for Spark 3.4 that work together.

Current Versions:
Spark 3.4.1
iceberg-spark-runtime-3.4-2.12:1.4.1
iceberg-aws-bundle:1.4.1
hadoop-aws:3.4.0
hadoop-common:3.4.0

I’ve tried a number of combinations of the above and their respective versions but all fall over with their assumptions on the aws sdk version with class not found exceptions or method not found etc.

Is there a compatibility matrix somewhere that someone could point me to?

Thanks
Dan

T. Rowe Price International Ltd (registered number 3957748) is registered in England and Wales with its registered office at Warwick Court, 5 Paternoster Square, London EC4M 7DX. T. Rowe Price International Ltd is authorised and regulated by the Financial Conduct Authority. The company has a branch in Dubai International Financial Centre (regulated by the DFSA as a Representative Office). T. Rowe Price (including T. Rowe Price International Ltd and its affiliates) and its associates do not provide legal or tax advice. Any tax-related discussion contained in this e-mail, including any attachments, is not intended or written to be used, and cannot be used, for the purpose of (i) avoiding any tax penalties or (ii) promoting, marketing, or recommending to any other party any transaction or matter addressed herein.
Please consult your independent legal counsel and/or professional tax advisor regarding any legal or tax issues raised in this e-mail. The contents of this e-mail and any attachments are intended solely for the use of the named addressee(s) and may contain confidential and/or privileged information. Any unauthorized use, copying, disclosure, or distribution of the contents of this e-mail is strictly prohibited by the sender and may be unlawful. If you are not the intended recipient, please notify the sender immediately and delete this e-mail.
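
The rule of thumb from the thread (Hadoop 3.4.x is built on the AWS SDK v2, while earlier 3.x releases expect the v1 `com.amazonaws` classes) can be sketched as a small helper that builds a `spark.jars.packages` value. This is only an illustration, not a tested compatibility matrix: the function names `aws_sdk_generation` and `spark_packages` are hypothetical, and the Hadoop version 3.3.4 used in the example is an assumption that should be checked against the hadoop-client jars in the Spark distribution's `jars/` directory.

```python
def aws_sdk_generation(hadoop_version: str) -> str:
    """Return which AWS SDK generation a hadoop-aws release is built on.

    Per the thread: Hadoop 3.4.x uses the AWS SDK v2; earlier releases
    expect the v1 classes (com.amazonaws.*).
    """
    major, minor = (int(part) for part in hadoop_version.split(".")[:2])
    return "v2" if (major, minor) >= (3, 4) else "v1"


def spark_packages(hadoop_version: str, iceberg_version: str,
                   spark_minor: str = "3.4", scala: str = "2.12") -> str:
    """Build a comma-separated Maven coordinate list for spark.jars.packages.

    Assumption: hadoop_version matches the Hadoop client jars that ship
    with the Spark distribution, so hadoop-aws and hadoop-common agree.
    """
    coords = [
        f"org.apache.iceberg:iceberg-spark-runtime-{spark_minor}_{scala}:{iceberg_version}",
        f"org.apache.iceberg:iceberg-aws-bundle:{iceberg_version}",
        f"org.apache.hadoop:hadoop-aws:{hadoop_version}",
    ]
    return ",".join(coords)


if __name__ == "__main__":
    hadoop = "3.3.4"  # assumed; verify against the jars in $SPARK_HOME/jars
    print(aws_sdk_generation(hadoop))
    print(spark_packages(hadoop, "1.4.1"))
```

The resulting string could be passed as `spark.jars.packages` when building the session. Leaving `hadoop-common` out of the list is deliberate in this sketch, on the assumption that Spark's own Hadoop client jars already provide it; adding a second, different `hadoop-common` is one way to end up with the `NoSuchMethodError` shown above.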