On May 16, 2021, at 09:48, Christopher Nielsen wrote:
> In terms of the ratio of vCPUs to GB of RAM, 1:1 isn’t totally unreasonable.
> However, we should also reserve 2 GB of RAM for the OS, including the disk
> cache. So perhaps 6 vCPUs would be a better choice.
MacPorts base has never considered reserving any RAM for the OS. I cannot
confirm that any changes are needed to the formula for how many jobs we start
based on how much RAM is available, but if they are, they should be
agreed upon and made in base first; then the buildbot setup can be adjusted
accordingly.
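To make the proposal concrete, here is a minimal sketch of what a job-count rule with a RAM reserve could look like. It is not the code in base: the function name, the `reserved_gb` parameter, and the "one job per CPU, capped at one job per usable GB" rule are assumptions for illustration, chosen to match the 1:1 ratio and the 2 GB reserve suggested above.

```python
def build_jobs(cpus: int, ram_gb: int, reserved_gb: int = 0) -> int:
    """Hypothetical rule: one job per CPU, capped at one job per GB of
    RAM left over after an optional reserve for the OS."""
    # Never let the reserve push the usable RAM below 1 GB.
    usable_gb = max(ram_gb - reserved_gb, 1)
    # Always start at least one job.
    return max(1, min(cpus, usable_gb))


if __name__ == "__main__":
    # An 8 vCPU / 8 GB VM, without and with the suggested 2 GB reserve.
    print(build_jobs(8, 8))
    print(build_jobs(8, 8, reserved_gb=2))
```

Under these assumptions, the reserve lowers the job count on an 8 vCPU / 8 GB VM from 8 to 6 without changing the number of vCPUs, which is why I would want this settled in base rather than in the VM configuration.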
> As for the total physical CPUs available on our Xserves, here’s the rub:
> While hyperthreading does provide some benefit, best-case it generally only
> provides 50% more headroom. And sometimes it’s as low as 25%.
I'm aware.
> So if we assume best-case, our Xserves only provide the processing power of
> 12 CPU cores, when accounting for hyperthreading. So even if only two
> builders are active, we’re already well overcommitted on CPU. And with three
> or more going, I’d bet the hypervisor is spending more time on scheduling and
> pre-emption, than actual processing time.
I cannot confirm or deny your claims about the hypervisor.
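The arithmetic behind the quoted estimate can be written out, though. This is only a sketch of the numbers in this thread: the 8 physical cores per Xserve are inferred from the "12 CPU cores" best-case figure (12 / 1.5), the 25% and 50% hyperthreading gains are the quoted estimates, and the function names are made up for illustration. It says nothing about how the hypervisor actually schedules.

```python
def effective_cores(physical_cores: int, ht_gain: float) -> float:
    """Physical cores scaled by an assumed hyperthreading gain (0.5 = +50%)."""
    return physical_cores * (1.0 + ht_gain)


def overcommit_ratio(active_vms: int, vcpus_per_vm: int,
                     physical_cores: int, ht_gain: float) -> float:
    """vCPUs in use by busy VMs divided by the effective core count."""
    return active_vms * vcpus_per_vm / effective_cores(physical_cores, ht_gain)


if __name__ == "__main__":
    PHYSICAL_CORES = 8   # assumption: inferred from 12 / 1.5
    VCPUS_PER_VM = 8     # from the thread: eight vCPUs per VM
    for gain in (0.5, 0.25):
        for vms in (1, 2, 3):
            ratio = overcommit_ratio(vms, VCPUS_PER_VM, PHYSICAL_CORES, gain)
            print(f"gain={gain:.2f} busy_vms={vms} ratio={ratio:.2f}")
```

A ratio above 1.0 means the busy VMs are asking for more than the host can run at once under that assumption; it does not by itself tell us how much time is lost to scheduling.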
> By way of comparison, I’m running on a modest 2008-era MacPro, with only
> eight physical CPU cores… and no hyper threading. Plus the Xeons in my MacPro
> are one major generation behind the Nehalem-based CPUs on our Xserves. Yet,
> my port build times are anywhere from 2x to 10x faster than we’re seeing on
> our builders. (And no, that’s not an exaggeration.)
Are you looking at the time to build just the port, or are you including the
time to install all the dependencies? Because on your system, you already have
the dependencies installed, or if not, they just need to be downloaded and
installed once. On the buildbot, on the other hand, dependencies are
deactivated between builds, and are sometimes activated and deactivated many
times before the main port is built. This can result in a significant and, in
my opinion, unnecessarily large amount of time being spent on dependencies
before the main build starts. I believe this can be improved upon. For example, instead
of deactivating *all* active ports between each build and between each
dependency, we could develop a way to deactivate only the active ports that are
not needed by the next build or dependency. This would have the added benefit
of reducing disk wear, which has been a problem before. See
https://trac.macports.org/ticket/62621
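The idea can be sketched as a set difference. This is a minimal illustration, not how the buildbot scripts work today: the function name is hypothetical, the port names are just examples, and the sketch ignores the order in which ports would have to be deactivated (dependents before their dependencies).

```python
from __future__ import annotations


def plan_transition(active: set[str],
                    needed: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_deactivate, to_activate) for moving from the currently
    active ports to the ports needed by the next build."""
    to_deactivate = active - needed   # active but not needed next
    to_activate = needed - active     # needed next but not yet active
    return to_deactivate, to_activate


if __name__ == "__main__":
    # Example port names only.
    active = {"zlib", "openssl", "cmake", "python39"}
    needed = {"zlib", "openssl", "pkgconfig"}
    to_deactivate, to_activate = plan_transition(active, needed)
    print("deactivate:", sorted(to_deactivate))
    print("activate:", sorted(to_activate))
```

In this example zlib and openssl stay active across the two builds, so only two ports are deactivated and one is activated, instead of deactivating four and activating three.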
It is definitely expected that any individual VM will take longer to build if
other VMs on the same host are also busy. Reducing the number of CPUs each VM
has will just ensure that the builds take longer, even if other VMs on the same
host are not busy.
> So we need to do something, as the buildbots simply can’t keep up.
I think they've been keeping up fine, except when a bunch of huge things need
to be built, which I find understandable.
> Upgrading them to six-core Xeons would absolutely help, for sure. But I’m
> quite certain that we could also improve the situation, by reducing the level
> of CPU overcommitment. And reducing the vCPUs per VM would help, as we simply
> don’t have the physical CPU power to support eight/VM.
When you say reduce CPU overcommitment, what means are you referring to,
other than reducing the number of CPUs per VM?