On Thu, Aug 17, 2023 at 07:41:50AM -0600, Simon Glass wrote:
> Hi Tom,
>
> On Tue, 15 Aug 2023 at 08:56, Tom Rini <tr...@konsulko.com> wrote:
> >
> > On Tue, Aug 15, 2023 at 08:44:20AM -0600, Simon Glass wrote:
> > > Hi Tom,
> > >
> > > On Sun, 13 Aug 2023 at 09:52, Tom Rini <tr...@konsulko.com> wrote:
> > > >
> > > > On Sat, Aug 12, 2023 at 09:14:45PM -0600, Simon Glass wrote:
> > > > >
> > > > > Hi Tom,
> > > > >
> > > > > I notice that the runners are not utilised much by the QEMU jobs,
> > > > > since we only run one at a time.
> > > > >
> > > > > I wonder if we could improve this, perhaps by using a different tag
> > > > > for the QEMU ones and then having a machine that only runs those
> > > > > (and runs 40 in parallel)?
> > > > >
> > > > > In general our use of the runners seems a bit primitive, since the
> > > > > main use of parallelism is in the world builds.
> > > >
> > > > I'm honestly not sure. I think there are a few tweaks that we should
> > > > do, like putting the opensbi and coreboot files into the Dockerfile
> > > > logic instead. And maybe seeing if, just as we can have a docker
> > > > registry cache, we can set up a local pypi cache too? I'm not
> > > > otherwise sure what's taking the 23 seconds or so of
> > > > https://source.denx.de/u-boot/u-boot/-/jobs/673565#L34 since the
> > > > build and run parts aren't much.
> > > >
> > > > My first big worry about running 2 or 3 qemu jobs at the same time
> > > > on a host is that any wins we get from a shorter queue will be lost
> > > > to buildman doing "make -j$(nproc)" 2 or 3 times at once, so we
> > > > build slower.
> > >
> > > Yes, perhaps.
> > >
> > > > My second big worry is that getting the right tags on runners will
> > > > be a little tricky.
> > >
> > > Yes, and error-prone. It also makes it harder to deal with broken
> > > machines.
> > >
> > > > My third big worry (but this is something you can test easily
> > > > enough, at least) is that running the big sandbox tests 2 or 3
> > > > times at once on the same host will get much slower. I think, but
> > > > profiling would be helpful, that those get slow due to I/O and not
> > > > CPU.
> > >
> > > I suspect it would be fast enough.
> > >
> > > But actually the other problem is that I am not sure whether the jobs
> > > would have their own filesystem?
> >
> > Yes, they should be properly sandboxed. If you want to test some of
> > these ideas, I think the best path is to temporarily un-register some
> > of your runners (comment out the token in config.toml) and then
> > register them with just the DM tree and experiment.
>
> OK, thanks for the idea. I tried this on tui.
>
> I used 'concurrent = 10' and it got up to a load of 70 or so every
> now and then, but mostly it was much less.
>
> The whole run (of just the test.py stage) took 8 minutes, with
> 'sandbox with clang test' taking the longest.
>
> I'm not too sure what that tells us...
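For anyone wanting to reproduce the experiment above, it boils down to
something like this in the runner host's config.toml. This is a minimal
sketch assuming the docker executor; the name, URL, token and image
values are illustrative, not copied from the real machine:

    concurrent = 10        # global limit: up to 10 jobs at once on this host

    [[runners]]
      name = "tui"
      url = "https://source.denx.de/"
      # token = "REDACTED" # commenting the token out takes the runner
      #                    # out of service, as suggested above
      executor = "docker"
      [runners.docker]
        image = "ubuntu:22.04"   # illustrative; CI would use its own image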
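The tag-based split floated at the top of the thread would then look
roughly like this on the .gitlab-ci.yml side. Again only a sketch: the
'qemu' tag and the job definition are illustrative, and each dedicated
runner would need the matching tag set at registration time
(gitlab-runner register --tag-list qemu):

    # Route a QEMU test.py job to runners registered with the "qemu" tag
    qemu-x86_64 test.py:
      stage: test.py
      tags:
        - qemu
      script:
        - ./test/py/test.py --bd qemu-x86_64 --build

Only runners carrying the 'qemu' tag would pick such jobs up, which is
what would let a single large machine run many of them in parallel.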
Well, looking at
https://source.denx.de/u-boot/u-boot/-/pipelines/17391/builds, the
whole run took 56 minutes, of which 46 minutes were the 32-bit ARM
world build. And the longest test.py stage was sandbox without LTO, at
just under 8 minutes. So I think trying to get more concurrency in
this stage is likely to be a wash in terms of overall CI run time.

-- 
Tom