Re: RFR: 8343546: GHA: Cache required dependencies in master-branch workflow [v2]

Aleksey Shipilev Mon, 07 Jul 2025 02:18:23 -0700

On Fri, 4 Jul 2025 15:30:56 GMT, Aleksey Shipilev <sh...@openjdk.org> wrote:


>> In our current GHA workflows, we only run workflows in branches in personal 
>> forks. GHA isolation rules say that workflow caches from the parent branches 
>> can be used by descendant branches. For our branches, the usual parent is 
>> `master`. Since we do not run workflows on `master`, this means every time 
>> we create a new branch, GHA would start with logically empty caches for it. 
>> Only the next trigger on the same branch would use the caches, saved from 
>> the first workflow run.
>> 
>> This means we put additional load on shared infrastructure with pulling 
>> JDKs, building jtreg (and pulling its dependencies), bootstrapping sysroots, 
>> etc. All these steps also fail intermittently every so often. It also means 
>> everyone carries lots of caches around, segregated by branch and repo (look 
>> into your https://github.com/your-github-name/jdk/actions/caches, for 
>> example) only relying on cache cleanups when it starts to hit 10 GB. With 
>> hundreds of contributors, this easily wastes terabytes of cloud storage 
>> space.
>> 
>> We can make all this more efficient and reliable, if we manage to run a 
>> master-branch workflow that bootstraps all required dependencies and caches 
>> them. These dependencies can then be used by PR branches, as "master" branch 
>> is their effective parent. 
>> 
>> This PR introduces the notion of "dry run", which does everything _except_ 
>> the actual builds and tests. Therefore, it verifies whether all dependencies 
>> are done properly for JDK configure to pass. This is useful in itself for 
>> future GHA debugging of dependencies. Workflow can be dispatched with 
>> additional "dry run" parameter now.
>> 
>> What makes master-branch caching possible is the second part of the PR that 
>> hooks up dry runs to master/stabilization branch pushes. These would make 
>> the dry-run workflow run every time you update your personal fork's 
>> master/stabilization branch. That dry run would likely finish very quickly 
>> if all caches are already in place. It would populate caches in 
>> master/stabilization branch in your personal fork, if not. 
>> 
>> The expected net result is that actual PRs that are branched off the 
>> personal fork master would be able to use the caches from that master 
>> workflow run. (If you want to make this experiment in current GHA, trigger 
>> the existing workflow on `master` branch in your fork, it would do roughly 
>> the same, but with all builds/tests).
>> 
>> A sample "dry-run" can be seen here: 
>> https://github.com/shipilev/jdk/actions/runs/16074619302. The most 
>> heavy-weight part is MSYS2 unpackin...
>
> Aleksey Shipilev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Final touches

Yeah, here is how it works:

0. I removed all caches in https://github.com/shipilev/jdk/actions/caches
1. I pulled `openjdk:master` to `shipilev:master`
2. The `master` dry-run was auto-triggered immediately and completed in ~20 
minutes, with most of the time spent on sysroot creation. I then triggered 
manual re-runs of that workflow on the `master` branch. The re-runs, now with 
full caches, took ~10 minutes, mostly driven by MSYS2 install times: 
https://github.com/shipilev/jdk/actions/runs/16111802273
3. I merged `master` into my feature branch 
[JDK-8361397-compilelog-list](https://github.com/shipilev/jdk/tree/JDK-8361397-compilelog-list), 
pushed, and that triggered a GHA run, which used the caches: 
https://github.com/shipilev/jdk/actions/runs/16112659046

So far I think it works as expected.
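
To spell out why the feature-branch run hits the caches, here is a toy Python model of the cache scoping described in the PR text: a run can restore caches saved on its own branch or on its parent (`master`), but not those saved on sibling branches. This is only a sketch. The `CacheStore` class, the `sysroot` and `jtreg` keys, and the `feature`/`other` branch names are made up for illustration, and real GHA scoping has more rules than this.

```python
from dataclasses import dataclass, field


@dataclass
class CacheStore:
    """Toy model of one fork's GHA caches, keyed by (branch, key). Hypothetical."""
    default_branch: str = "master"
    entries: set = field(default_factory=set)

    def save(self, branch: str, key: str) -> None:
        # A cache is stored in the scope of the branch whose run saved it.
        self.entries.add((branch, key))

    def restore(self, branch: str, key: str):
        """Return the branch scope that produced the hit, or None on a miss."""
        # Look in the run's own branch first, then fall back to the parent branch.
        for scope in (branch, self.default_branch):
            if (scope, key) in self.entries:
                return scope
        return None


store = CacheStore()

# Without a master-branch run, a fresh feature branch starts with empty caches.
print(store.restore("feature", "sysroot"))

# After the dry run on master populates the cache, the feature branch can use it.
store.save("master", "sysroot")
print(store.restore("feature", "sysroot"))

# Caches saved on a sibling branch stay invisible to the feature branch.
store.save("other", "jtreg")
print(store.restore("feature", "jtreg"))
```

The first lookup misses, the second is served from the `master` scope, and the third misses again, which matches the sequence above: empty caches, a `master` dry run, then a feature-branch run that reuses what `master` saved.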

I also found an improvement opportunity for MSYS2: 
[JDK-8361478](https://bugs.openjdk.org/browse/JDK-8361478).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26134#issuecomment-3044116386
