On Mon, Feb 18, 2019 at 9:32 AM Daniel Stone <dan...@fooishbar.org> wrote:
>
> Hi all,
> A few people have noted that Mesa's GitLab CI is just too slow, and
> not usable in day-to-day development, which is a massive shame.
>
> I looked into it a bit this morning, and also discussed it with
> Emil, though nothing in this is speaking for him.
>
> Taking one of the last runs as representative (nothing in it looks
> like an outlier to me, and 7min to build RadeonSI seems entirely
> reasonable):
> https://gitlab.freedesktop.org/mesa/mesa/pipelines/19692/builds
>
> This run executed 24 jobs, which is beyond the limit of our CI
> parallelism. As documented on
> https://www.freedesktop.org/wiki/Infrastructure/ we have 14
> concurrent job slots (each with roughly 4 vCPUs). Those 24 jobs
> cumulatively took 177 minutes of execution time, and the end-to-end
> pipeline took 120 minutes.
>
> 177 minutes of runtime is too long for the runners we have now: even
> if a pipeline perfectly occupied all our runners, it would take over
> 12 minutes (177 / 14 ≈ 12.6), which means that even if no-one else
> were using the runners, we could execute at most about five Mesa
> pipelines per hour at full occupancy. Unfortunately, VirGL,
> Wayland/Weston, libinput, X.Org, IGT, GStreamer,
> NetworkManager/ModemManager, Bolt, Poppler, etc., would all probably
> have something to say about that.
>
> When the runners aren't occupied and there's less contention for
> jobs, it looks quite good:
> https://gitlab.freedesktop.org/anholt/mesa/pipelines/19621/builds
>
> This run 'only' took 20.5 minutes to execute, but then again, 3
> pipelines per hour isn't that great either.
>
> Two hours of end-to-end pipeline time is also obviously far too
> long. Amongst other things, it practically precludes pre-merge CI:
> by the time your build has finished, someone will have pushed to the
> tree, so you need to start again. Even if we serialised it through a
> bot, that would limit us to pushing 12 changesets per day, which
> seems too low.
>
> I'm currently talking to two different hosts to try to get more
> sponsored time for CI runners. Both conversations are on hold this
> week due to travel / personal circumstances, but I'll hopefully find
> out more next week. Eric E filed an issue
> (https://gitlab.freedesktop.org/freedesktop/freedesktop/issues/120)
> to enable a ccache cache, but I don't see myself having the time to
> do it before next month.
>
> In the meantime, it would be great to see how we could reduce the
> number of jobs Mesa runs for each pipeline. Given we're already
> exceeding the limits of parallelism, having so many independent jobs
> isn't reducing the end-to-end pipeline time, but instead just
> duplicating the effort required to fetch and check out sources,
> cache (in the future), start the container, run meson or
> ./configure, and build any common files.
>
> I'm taking it as a given that at least three separate builds are
> required: autotools, Meson, and SCons. Fair enough.
>
> It's been suggested to me that SWR should remain separate, as it
> takes longer to build than the other drivers, and getting fast
> feedback is important, which is fair enough.
>
> Suggestion #1: merge scons-swr into scons-llvm. scons-nollvm will
> already provide fast feedback on whether we've broken the SCons
> build, and the rest is pretty uninteresting, so merging the two
> might help cut down on duplication.
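For #1, that looks like an easy win. A rough sketch of what it could
look like in .gitlab-ci.yml (untested; the job name, the .scons-build
template, and the SCONS_TARGET variable are my guesses at how the
jobs are wired up today, and the llvm=1/swr=1 scons options are from
memory, so the exact spellings may differ):

  # Fold the swr build into the llvm job, then delete the old
  # scons-swr job entirely.
  scons-llvm:
    extends: .scons-build
    variables:
      SCONS_TARGET: "llvm=1 swr=1"
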
> Suggestion #2: merge the misc Gallium jobs together. The
> gallium-radeonsi and gallium-st-other builds are both relatively
> quick. We could merge these into gallium-drivers-other for a very
> small increase in overall runtime for that job, and save ourselves
> probably about 10% of the overall build time here.
>
> Suggestion #3: don't build so much LLVM in autotools. The Meson
> clover-llvm builds take half the time the autotools builds do.
> Perhaps we should only build one LLVM variant within autotools (to
> check that the autotools LLVM selection still works), and then build
> all the rest only in Meson. That would be good for another 15-20%
> reduction in overall pipeline run time.
>
> Suggestion #4 (if necessary): build SWR less frequently. Can we
> perhaps demote SWR to an 'only:' job which rebuilds SWR only if SWR
> itself or Gallium has changed? This would save a good chunk of
> runtime - again close to 10%.
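For #4, GitLab's only:changes seems like the right tool. Something
like this might work (untested; the path list is just a first guess
at what actually affects the SWR build and would need refining):

  scons-swr:
    only:
      changes:
        # rebuild only when SWR itself or common Gallium code changes
        - src/gallium/drivers/swr/**/*
        - src/gallium/auxiliary/**/*
        - src/gallium/include/**/*

One caveat I know of: when a branch is first pushed there is nothing
to diff against, so changes: matches everything and the job runs
anyway.
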
> Doing the above would reduce the run time fairly substantially,
> with (as far as I can tell) no loss in functional coverage, and
> bring the parallelism down to a mere 1.5x oversubscription of the
> whole organisation's available job slots, from the current 2x.
>
> Any thoughts?

All of your suggestions seem reasonable. Removing autotools [1] would
obviously reduce the number of builds.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=mesa-autotools-removal

If I understood correctly, we are kicking off a CI run for every push
to a fork of the Mesa repo, and not just for merge requests. I think
that's absolutely the wrong thing to do. CI for personal branches
should be opt-in.
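One rough way to express that opt-in (untested; merge-request
pipelines need a reasonably recent GitLab, and the ci/ branch-name
convention is just an example, not anything we've agreed on):

  # Run jobs for merge requests and for the upstream repo, but on
  # personal forks only for branches that explicitly opt in.
  .build-common:
    only:
      - merge_requests
      - branches@mesa/mesa
      - /^ci\/.*$/    # push to a branch named ci/something to opt in

That way a plain push to a fork costs nothing, and anyone who wants a
full run can still get one.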