I'm testing the new spark-connect distribution, and here are the results. I tested 4 packages: pip install pyspark, pip install pyspark_connect (I installed both from the RC4 pyspark tarballs), the classic tarball (spark-4.0.0-bin-hadoop3.tgz), and the connect tarball (spark-4.0.0-bin-hadoop3-spark-connect.tgz).

*case 1*: run the pyspark and spark-shell commands, with and without --master local. They should use the default mode (classic or connect, depending on the distribution package).
Everything works as expected.

*case 2*: run the pyspark and spark-shell commands with --remote local. They should use the connect mode.
Everything works as expected.

*case 3*: run the pyspark and spark-shell commands with --master local --conf spark.api.mode=classic. They should use the classic mode.
The connect packages fail with TypeError: 'JavaPackage' object is not callable when running the pyspark command. @Hyukjin Kwon <gurwls...@gmail.com> is looking into it now and will share the findings later.

Please let me know if you find any other issues with RC4, either functionality issues with Spark itself or integration issues with downstream libraries.

Thanks!
Wenchen

On Thu, Apr 10, 2025 at 11:21 PM Wenchen Fan <cloud0...@gmail.com> wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 4.0.0.
>
> The vote is open until April 15 (PST) and passes if a majority +1 PMC
> votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 4.0.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v4.0.0-rc4 (commit
> e0801d9d8e33cd8835f3e3beed99a3588c16b776)
> https://github.com/apache/spark/tree/v4.0.0-rc4
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v4.0.0-rc4-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1480/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v4.0.0-rc4-docs/
>
> The list of bug fixes going into 4.0.0 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12353359
>
> This release is using the release script of the tag v4.0.0-rc4.
>
> FAQ
>
> =========================
> How can I help test this release?
> =========================
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC and see if anything important breaks, in the Java/Scala
> you can add the staging repository to your projects resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out of date RC going forward).
>
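
For anyone who wants to repeat the three cases against their own install, here is a rough sketch of the command matrix in Python. It only builds and prints the command lines; it does not launch Spark or check the resulting mode. The commands and flags come from the cases at the top of this mail. The names build_matrix, COMMANDS and CASES, and the "default" label for case 1, are made up for illustration.

```python
from itertools import product

# Commands and flags are the ones listed in cases 1-3.
# All names below (COMMANDS, CASES, build_matrix) are illustrative only.
COMMANDS = ["pyspark", "spark-shell"]

CASES = {
    # "default" means classic or connect, depending on the distribution package.
    "case 1": {"flag_sets": [[], ["--master", "local"]], "expected": "default"},
    "case 2": {"flag_sets": [["--remote", "local"]], "expected": "connect"},
    "case 3": {
        "flag_sets": [["--master", "local", "--conf", "spark.api.mode=classic"]],
        "expected": "classic",
    },
}


def build_matrix():
    """Return (case, argv, expected_mode) for every command/flag combination."""
    rows = []
    for case, spec in CASES.items():
        for cmd, flags in product(COMMANDS, spec["flag_sets"]):
            rows.append((case, [cmd, *flags], spec["expected"]))
    return rows


if __name__ == "__main__":
    for case, argv, expected in build_matrix():
        print(f"{case}: {' '.join(argv)}  (expected mode: {expected})")
```

Each argv list could be handed to subprocess.run from inside the environment under test, repeated once per package, but how to detect the active mode from the shell output is left open here.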