Hi, Amin

In general, the Apache Spark community has received a lot of feedback and has been moving forward to

- Use the latest Hadoop versions, for more bug fixes including CVE fixes.
- Use Hadoop's shaded clients, to minimize dependency issues.

Since the above is not achievable with Hadoop 2 clients, I believe the official answer to question (1) is `No` (especially for your Hadoop 2.7 cluster, whose latest release dates from 2018).
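For reference, the shaded clients mentioned above are the hadoop-client-api and hadoop-client-runtime artifacts that Hadoop 3 publishes. A minimal sbt sketch; the version number here is only illustrative:

    // Hadoop 3 shaded clients: a thin API jar plus one relocated runtime jar,
    // instead of hadoop-client's large transitive dependency tree.
    libraryDependencies ++= Seq(
      "org.apache.hadoop" % "hadoop-client-api"     % "3.3.1",
      "org.apache.hadoop" % "hadoop-client-runtime" % "3.3.1" % Runtime
    )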
For the second question, the Apache Spark community has been collaborating with the Apache Hadoop community to make the latest Apache Hadoop 3 clients able to connect to both old and new Hadoop clusters, as well as to public cloud environments. I believe your production jobs should be fine as long as you are not relying on proprietary (i.e., non-Apache Hadoop) features from private vendors. Please report to the Apache Hadoop community, or to us, if you hit unknown compatibility issues.

Bests,
Dongjoon.

On Fri, Apr 8, 2022 at 9:37 PM Amin Borjian <borjianami...@outlook.com> wrote:
>
> From Spark version 3.1.0 onwards, the clients provided for Spark are built
> with Hadoop 3 and published to the Maven repository. Unfortunately, we
> currently use Hadoop 2.7.7 in our infrastructure.
>
> 1) Does Spark plan to publish the Spark client dependencies for Hadoop 2.x?
>
> 2) Are the new Spark clients capable of connecting to a Hadoop 2.x
> cluster? (According to a simple test, the Spark 3.2.1 client had no problem
> with our Hadoop 2.7 cluster, but we wanted to know whether there is any
> guarantee from Spark.)
>
> Thank you very much in advance
>
> Amin Borjian
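P.S. For anyone who wants to repeat the kind of smoke test Amin describes, a minimal sketch follows; the namenode address and path are placeholders you would replace with your own:

    // Minimal smoke test: a Spark 3.x client (built with the Hadoop 3
    // shaded clients) reading a file from an HDFS 2.7 cluster.
    // "namenode:8020" and the path below are placeholders.
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("hadoop2-compat-smoke-test")
      .getOrCreate()

    val lines = spark.read.textFile("hdfs://namenode:8020/tmp/compat-check.txt")
    println(s"Read ${lines.count()} lines from the Hadoop 2.7 cluster")

    spark.stop()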