What about JDK 8? If I remember correctly, Spark 2 was what held us back. Do we 
want to consider switching to JDK 11 for releases?

- Anton

> On Apr 20, 2023, at 2:10 AM, Driesprong, Fokko <fo...@driesprong.frl> wrote:
> 
> Thanks all for the response, much appreciated.
> 
> That said, I'd love to hear from more people on this. I think it would be 
> great to drop support, but I don't know how many people still use it. Is 
> upgrading Hadoop a good reason to drop support for an engine? Hadoop seems 
> like a minor concern to me unless it is blocking something.
> 
> I noticed that we needed to bump Hadoop when we wanted to upgrade to Parquet 
> 1.13.0 <https://github.com/apache/iceberg/pull/7301>. It would be nice to get 
> this in since it allows for removing a workaround from the Iceberg codebase 
> (see PR for details).
> 
> Netflix is still on Spark-2.4.4 with Iceberg-0.9. We are actively migrating 
> to Spark-3.x and Iceberg 1.1 (or later). I do not anticipate us using 
> Spark-2.4.4 with newer versions of Iceberg (>0.9). 
> 
> For Spark 2.4, Iceberg releases up to 1.2.1 are available: 
> https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-spark-2.4
> 
> As for the Hadoop upgrade, I think that could be problematic for us if 
> there's any non-backwards compatible API change required at compile time 
> since we're still running a 2.8.x version.
> 
> Thanks for raising this. I took some time today to dig into this. There is an 
> effort to upgrade Hadoop <https://github.com/apache/iceberg/pull/5024> in 
> Iceberg, but that's stuck on incompatibilities with Tez. Unfortunately, 
> Parquet 1.13.0 
> <https://github.com/apache/iceberg/actions/runs/4740904793/jobs/8417296190?pr=7301>
>  doesn't compile against Hadoop 2.8.5, and bringing back support for Hadoop 
> 2.8.x is going to be hard <https://github.com/apache/parquet-mr/pull/1075>. 
> For Parquet, I've created a PR to run the CI against Hadoop 2.9.2 
> <https://github.com/apache/parquet-mr/pull/1076> so we know when we're 
> breaking compatibility.
> 
> TLDR: It looks like if we want to upgrade Parquet, and other libraries in the 
> future, we need to drop Hadoop 2. I'm hesitant to do that right now because 
> we might exclude users that are still on older versions of Hadoop (such as 
> Airbnb). Spark has announced that Hadoop 2 will be dropped in Spark 3.5 
> <https://lists.apache.org/thread/vr6bx2bmkgo4mhdspjm9g29h2c3lmrrz>. I'll 
> create a PR for removing Spark 2.4 shortly because I see a consensus for 
> removing that.
> 
> Kind regards,
> Fokko
> 
> On Wed, Apr 19, 2023 at 19:02, Anton Okolnychyi 
> <aokolnyc...@apple.com.invalid> wrote:
> Yes, yes, yes!
> 
> - Anton
> 
>> On Apr 19, 2023, at 8:17 AM, Ryan Blue <b...@tabular.io 
>> <mailto:b...@tabular.io>> wrote:
>> 
>> Sounds like we have consensus for removing Spark 2.4.
>> 
>> Thanks, everyone!
>> 
>> On Wed, Apr 19, 2023 at 12:36 AM Ajantha Bhat <ajanthab...@gmail.com 
>> <mailto:ajanthab...@gmail.com>> wrote:
>> +1, 
>> Spark-2.4 has reached EOL 
>> (https://lists.apache.org/thread/tdk7r5gx3nwrds3fg7qmp5h2jnqgc6tb and 
>> https://spark.apache.org/versioning-policy.html) 
>> 
>> Thanks, 
>> Ajantha
>> 
>> On Wed, Apr 19, 2023 at 3:52 AM Edgar Rodriguez 
>> <edgar.rodrig...@airbnb.com.invalid 
>> <mailto:edgar.rodrig...@airbnb.com.invalid>> wrote:
>> I'm generally +1 on dropping Spark 2.4. Almost everyone is moving to Spark 
>> 3.x, if they haven't moved already.
>> 
>> As for the Hadoop upgrade, I think that could be problematic for us if 
>> there's any non-backwards compatible API change required at compile time 
>> since we're still running a 2.8.x version.
>> 
>> Cheers,
>> 
>> On Mon, Apr 17, 2023 at 3:50 PM Steve Zhang <hongyue_zh...@apple.com.invalid 
>> <mailto:hongyue_zh...@apple.com.invalid>> wrote:
>> +1 for dropping Spark 2.4 support. We can also clean up the docs, such as 
>> https://iceberg.apache.org/docs/latest/spark-queries/#spark-24
>> 
>> Thanks,
>> Steve Zhang
>> 
>> 
>> 
>>> On Apr 13, 2023, at 12:53 PM, Jack Ye <yezhao...@gmail.com 
>>> <mailto:yezhao...@gmail.com>> wrote:
>>> 
>>> +1 for dropping 2.4 support
>>> 
>> 
>> 
>> 
>> -- 
>> Edgar R
>> Data Warehouse Infrastructure
>> 
>> 
>> -- 
>> Ryan Blue
>> Tabular
> 
