Hi all,

I am trying to revive this thread, to work towards a better release process and to make sure we have no conflicts in the artifacts we use, as nicholas.cham...@gmail.com mentioned.

@Wenchen Fan <cloud0...@gmail.com> - can you please clarify: you state that the release scripts use a different build and Docker image than Github Actions. The release scripts release the artifacts that are actually being used... What are the other artifacts, the ones created by Github Actions today, used for? Only testing?
Personally, I believe that "release is king": whatever is actually being used by all the users is the "correct" build, and we should align ourselves to it. What do you think are the next steps we need to take in order to make the release process fully automated and simple?

Thanks,
Nimrod

On Mon, May 13, 2024 at 2:31 PM Wenchen Fan <cloud0...@gmail.com> wrote:

> Hi Nicholas,
>
> Thanks for your help! I'm definitely interested in participating in this
> unification work. Let me know how I can help.
>
> Wenchen
>
> On Mon, May 13, 2024 at 1:41 PM Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
>> Re: unification
>>
>> We also have a long-standing problem with how we manage Python
>> dependencies, something I’ve tried (unsuccessfully
>> <https://github.com/apache/spark/pull/27928>) to fix in the past.
>>
>> Consider, for example, how many separate places this numpy dependency is
>> installed:
>>
>> 1. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L277
>> 2. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L733
>> 3. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L853
>> 4. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L871
>> 5. https://github.com/apache/spark/blob/8094535973f19e9f0543535a97254e8ebffc1b23/.github/workflows/build_python_connect35.yml#L70
>> 6. https://github.com/apache/spark/blob/553e1b85c42a60c082d33f7b9df53b0495893286/.github/workflows/maven_test.yml#L181
>> 7. https://github.com/apache/spark/blob/6e5d1db9058de62a45f35d3f41e028a72f688b70/dev/requirements.txt#L5
>> 8. https://github.com/apache/spark/blob/678aeb7ef7086bd962df7ac6d1c5f39151a0515b/dev/run-pip-tests#L90
>> 9.
>> https://github.com/apache/spark/blob/678aeb7ef7086bd962df7ac6d1c5f39151a0515b/dev/run-pip-tests#L99
>> 10. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/dev/create-release/spark-rm/Dockerfile#L40
>> 11. https://github.com/apache/spark/blob/9a42610d5ad8ae0ded92fb68c7617861cfe975e1/dev/infra/Dockerfile#L89
>> 12. https://github.com/apache/spark/blob/9a42610d5ad8ae0ded92fb68c7617861cfe975e1/dev/infra/Dockerfile#L92
>>
>> None of those installations reference a unified version requirement, so
>> naturally they are inconsistent across all these different lines. Some say
>> `>=1.21`, others say `>=1.20.0`, and still others say `==1.20.3`. In
>> several cases there is no version requirement specified at all.
>>
>> I’m interested in trying again to fix this problem, but it needs to be in
>> collaboration with a committer since I cannot fully test the release
>> scripts. (This testing gap is what doomed my last attempt at fixing this
>> problem.)
>>
>> Nick
>>
>>
>> On May 13, 2024, at 12:18 AM, Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>> After finishing the 4.0.0-preview1 RC1, I have more experience with this
>> topic now.
>>
>> In fact, the main jobs of the release process, building packages and
>> documents, are tested in Github Actions jobs. However, the way we test
>> them is different from what we do in the release scripts.
>>
>> 1. The execution environment is different:
>> The release scripts define the execution environment with this Dockerfile:
>> https://github.com/apache/spark/blob/master/dev/create-release/spark-rm/Dockerfile
>> However, Github Actions jobs use a different Dockerfile:
>> https://github.com/apache/spark/blob/master/dev/infra/Dockerfile
>> We should figure out a way to unify them. The Docker image for the
>> release process needs to set up more things, so it may not be viable to
>> use a single Dockerfile for both.
>>
>> 2. The execution code is different.
>> Use building documents as an example:
>> The release scripts:
>> https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh#L404-L411
>> The Github Actions job:
>> https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml#L883-L895
>> I don't know which one is more correct, but we should definitely unify
>> them.
>>
>> It would be better if we could run the release scripts as Github Actions
>> jobs, but I think it's more important to do the unification now.
>>
>> Thanks,
>> Wenchen
>>
>>
>> On Fri, May 10, 2024 at 12:34 AM Hussein Awala <huss...@awala.fr> wrote:
>>
>>> Hello,
>>>
>>> I can answer some of your questions that are common with other Apache
>>> projects.
>>>
>>> > Who currently has permissions for Github Actions? Is there a specific
>>> owner for that today, or a different volunteer each time?
>>>
>>> The Apache organization owns Github Actions, and committers
>>> (contributors with write permissions) can retrigger/cancel a Github
>>> Actions workflow, but the Github Actions runners are managed by the
>>> Apache infra team.
>>>
>>> > What are the current limits of GitHub Actions, who set them - and what
>>> is the process to change those (if possible at all, but I presume not all
>>> Apache projects have the same limits)?
>>>
>>> For limits, I don't think there is any significant limit, especially
>>> since the Apache organization has 900 donated runners used by its
>>> projects, and there is an initiative from the Infra team to add
>>> self-hosted runners running on Kubernetes (document
>>> <https://cwiki.apache.org/confluence/display/INFRA/ASF+Infra+provided+self-hosted+runners>).
>>>
>>> > Where should the artifacts be stored?
>>>
>>> Usually, we use Maven for jars, DockerHub for Docker images, and Github
>>> cache for workflow caches. But we can use Github artifacts to store any
>>> kind of package (even Docker images, in the ghcr), which is fully
>>> accepted by Apache policies.
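[Editor's note: Wenchen's point about the doc-build code living in two places could be addressed by extracting it into one shared script that both `release-build.sh` and the workflow call. A minimal sketch follows; the script name, the `build_docs` function, and the assumption that the docs are built with Jekyll are all illustrative, not existing Spark code. The `DRY_RUN` switch is included only so the shared logic can be exercised without a full docs toolchain.]

```shell
# Hypothetical shared entry point (e.g. dev/build-docs.sh, an assumed name).
# Both dev/create-release/release-build.sh and build_and_test.yml would call
# this one function instead of each duplicating the doc-build steps.
build_docs() {
  # run() prints each command instead of executing it when DRY_RUN=1,
  # which lets CI verify the shared logic without building real docs.
  run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi; }
  run cd docs
  run bundle exec jekyll build
}

DRY_RUN=1 build_docs
```

With a single script like this, any divergence between the release build and the CI build of the docs becomes impossible by construction, which is the point of the unification.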
>>> Also, if the project has a cloud account (AWS, GCP, Azure, ...), a
>>> bucket can be used to store some of the packages.
>>>
>>>
>>> > Who should be permitted to sign a version - and what is the process
>>> for that?
>>>
>>> The Apache documentation is clear about this: by default, only PMC
>>> members can be release managers, but we can contact the infra team to
>>> add one of the committers as a release manager (document
>>> <https://infra.apache.org/release-publishing.html#releasemanager>). The
>>> process of creating a new release is described in this document
>>> <https://www.apache.org/legal/release-policy.html#policy>.
>>>
>>>
>>> On Thu, May 9, 2024 at 10:45 AM Nimrod Ofek <ofek.nim...@gmail.com> wrote:
>>>
>>>> Following the conversation started with the Spark 4.0.0 release, this
>>>> is a thread to discuss improvements to our release processes.
>>>>
>>>> I'll start by raising some questions that probably should have
>>>> answers, to start the discussion:
>>>>
>>>> 1. What is currently running in GitHub Actions?
>>>> 2. Who currently has permissions for Github Actions? Is there a
>>>> specific owner for that today, or a different volunteer each time?
>>>> 3. What are the current limits of GitHub Actions, who set them - and
>>>> what is the process to change those (if possible at all, but I presume
>>>> not all Apache projects have the same limits)?
>>>> 4. What versions should we support as an output of the build?
>>>> 5. Where should the artifacts be stored?
>>>> 6. What should the output be? Only a tar, or also a Docker image
>>>> published somewhere?
>>>> 7. Do we want to have releases on fixed dates, or a manual release
>>>> upon request?
>>>> 8. Who should be permitted to sign a version - and what is the process
>>>> for that?
>>>>
>>>> Thanks!
>>>> Nimrod
>>>>
>>>
>>
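[Editor's note: to make the signing question above concrete, here is a minimal sketch of the mechanics a release manager runs, following the common Apache convention of a detached GPG signature plus a SHA-512 checksum. The artifact name is a stand-in and the payload is dummy data, not a real release tarball; the GPG steps are shown as comments because they require the release manager's private key.]

```shell
# Stand-in artifact (spark-4.0.0-bin.tgz is an illustrative name only).
ARTIFACT=/tmp/spark-4.0.0-bin.tgz
echo "dummy release payload" > "$ARTIFACT"

# Checksum file, verifiable by any downloader with coreutils:
sha512sum "$ARTIFACT" > "$ARTIFACT.sha512"
sha512sum -c "$ARTIFACT.sha512"

# With the release manager's GPG key available, the signing step would be:
#   gpg --armor --detach-sign "$ARTIFACT"   # produces $ARTIFACT.asc
# and users verify the download with:
#   gpg --verify "$ARTIFACT.asc" "$ARTIFACT"
```

Whoever is permitted to sign is then simply whoever holds a key listed in the project's KEYS file, which is what the release-manager policy linked above governs.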
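[Editor's note: Nicholas's numpy example earlier in the thread, where twelve install sites carry twelve independent version pins, is the kind of thing a single pip constraints file solves. A minimal sketch, assuming a hypothetical file name; neither the path nor the pin value comes from the Spark repo:]

```shell
# Hypothetical single source of truth for the numpy pin
# (/tmp/dev/constraints.txt is an assumed path for illustration).
mkdir -p /tmp/dev
cat > /tmp/dev/constraints.txt <<'EOF'
numpy>=1.21
EOF

# Every workflow step, Dockerfile, and dev script would then install with:
#   pip install -c /tmp/dev/constraints.txt numpy
# pip's -c flag caps the resolved version without adding new packages,
# so bumping the pin in this one file updates all twelve install sites.
grep numpy /tmp/dev/constraints.txt
```

The same file can be consumed from both the CI Dockerfile and the release Dockerfile, which also chips away at the environment-unification problem Wenchen raised.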