Hi,
On 2024-12-26 20:57, Michael Stone wrote:
> As I suggested: you need two tools or one new tool because what you're
> looking for is the min of ncpus and (available_mem / process_size). The
> result of that calculation is not the "number of cpus", it is the
> number of processes you want to run.
This is definitely true. "nproc" could potentially be repurposed to mean
"number of processes" though.
> Here's the problem: the definition of "available memory" is very vague.
> `free -hwv` output from a random machine:
>
>                total        used        free      shared     buffers       cache   available
> Mem:            30Gi       6.7Gi       2.4Gi       560Mi       594Mi        21Gi        23Gi
> Swap:           11Gi       2.5Mi        11Gi
> Comm:           27Gi        22Gi       4.3Gi
>
> Is the amount of available memory 2.4Gi, 23Gi, maybe 23+11Gi? Or 4.3Gi?
> IMO, there is no good answer to that question.
I would rather argue that there is no perfect answer to that question,
but that the 23GiB in the "available" column are good enough for most
use cases, including building stuff, IF (and only if) you take into
account that you can't have all of it committed by processes: you still
need a decent amount of cache and buffers (how much? very good
question, thank you) for the build to run smoothly and efficiently.
Swap should be ignored for all practical purposes here.
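To illustrate what I mean, here is a minimal sketch of the figure I
would start from: MemAvailable minus some headroom. The 4 GiB headroom
is an arbitrary placeholder of mine, not a recommendation, and would
really need to be tunable:

    #!/bin/sh
    # Sketch: start from MemAvailable and keep some headroom for cache
    # and buffers. The 4 GiB value is a placeholder, not a recommendation.
    avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
    headroom_kb=$((4 * 1024 * 1024))
    usable_kb=$((avail_kb - headroom_kb))
    [ "$usable_kb" -gt 0 ] || usable_kb=0
    echo "usable memory: $((usable_kb / 1024 / 1024)) GiB"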
> (or else, what's wrong with using /proc/meminfo directly?)
I haven't looked at how packages currently try to compute potential
parallelism from /proc/meminfo data, but my own experience with Java
stuff, and with otherwise perfectly competent, highly qualified
engineers getting the available-RAM computation wrong, makes me not too
optimistic about the overall accuracy of these guesses.

E.g. a few hours ago:

> I fear your rebuild is ooming workers (...) it seems that some package
> is reducing its parallelism to two c++ compilers and that still
> exceeds 20G
Providing a simple tool that standardizes the calculation, along with
documented examples and guidelines, is certainly going to help here. It
would also move the logic that collects the data, parses it and
computes the result into a single place, reducing duplication and
maintenance burden across packages.
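As a rough sketch of what such a tool could boil down to (the name and
interface below are made up for illustration, nothing like it exists as
far as I know): take a per-process memory estimate as input, read
MemAvailable, and return min(ncpus, available / per_process):

    #!/bin/sh
    # Hypothetical usage: guess-parallel <per-process-MiB>
    per_proc_mib=${1:?usage: $0 <per-process-MiB>}
    ncpus=$(nproc)
    avail_mib=$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)
    by_mem=$((avail_mib / per_proc_mib))
    [ "$by_mem" -ge 1 ] || by_mem=1      # never go below one process
    # The result is a number of processes, not a number of cpus.
    if [ "$by_mem" -lt "$ncpus" ]; then echo "$by_mem"; else echo "$ncpus"; fi

Build rules that know roughly how much RAM one of their compiler or
linker processes needs could then call something like that instead of
plain nproc.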
> You'd need to somehow get people to define policies, what would that
> look like?
I would suggest making it possible to input the marginal RAM
requirement per parallelized process, that is, the amount of additional
"available RAM" needed for every additional process. As that value is
very probably going to be larger for the first few processes, and as
this matters more in constrained environments (e.g. containers, busy CI
runners etc.), making it possible to define a curve (e.g. 8 GiB - 5 GiB
- 2 GiB - 2 GiB ... => 7 workers with 23 GiB of available RAM) would
allow a closer match to the constraints of these environments.
In addition, it would be nice to have an option to limit the computed
result to the number of actual cpu cores (not vcpus/threads), and
another one to set an arbitrary upper limit on the number of processes
beyond which no gains are expected.
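To make the idea concrete, here is how the curve and the two caps could
combine. Again just a sketch: the 8/5/2 curve, the 32-process cap and
the names are only examples, and nproc counts hardware threads rather
than physical cores, so a real tool would need something better there:

    #!/bin/sh
    # Declining marginal RAM cost per additional process, in GiB; the last
    # value repeats for every further process (here: 8, 5, then 2 GiB each).
    curve="8 5 2"
    max_procs=32              # arbitrary cap beyond which no gains are expected
    cores=$(nproc)            # caveat: hardware threads, not physical cores

    awk -v curve="$curve" -v cores="$cores" -v maxp="$max_procs" '
        /^MemAvailable:/ {
            budget = $2 / 1024 / 1024            # GiB
            n = split(curve, cost, " ")
            procs = 0
            while (procs < cores && procs < maxp) {
                c = (procs < n) ? cost[procs + 1] : cost[n]
                if (budget < c) break
                budget -= c
                procs++
            }
            if (procs < 1) procs = 1             # always allow one process
            print procs
        }' /proc/meminfo

With exactly 23 GiB available and enough cores, this gives
8 + 5 + 5 x 2 = 23 GiB, i.e. 7 workers, as in the example above.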
Cheers,
--
Julien Plissonneau Duquène