Hi,

On 2024-12-26 20:57, Michael Stone wrote:

As I suggested: you need two tools or one new tool because what you're looking for is the min of ncpus and (available_mem / process_size). The result of that calculation is not the "number of cpus", it is the number of processes you want to run.

This is definitely true. "nproc" could potentially be repurposed to mean "number of processes" though.

Here's the problem: the definition of "available memory" is very vague. `free -hwv` output from a random machine:

               total        used        free      shared     buffers       cache   available
Mem:            30Gi       6.7Gi       2.4Gi       560Mi       594Mi        21Gi        23Gi
Swap:           11Gi       2.5Mi        11Gi
Comm:           27Gi        22Gi       4.3Gi

Is the amount of available memory 2.4Gi, 23Gi, maybe 23+11Gi? Or 4.3Gi?
IMO, there is no good answer to that question.

I would rather argue that there is no perfect answer to that question, but that the 23 GiB in the "available" column is good enough for most use cases, including building stuff, IF (and only if) you take into account that processes can't commit all of it: you still need a decent amount of cache and buffers (how much? very good question, thank you) for the build to run smoothly and efficiently. Swap should be ignored for all practical purposes here.
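To make that "available minus headroom" idea concrete, here is a minimal sketch in Python. The only thing it assumes from the system is the standard MemAvailable field of /proc/meminfo; the 20% reserve is a made-up placeholder, not a recommendation:

```python
#!/usr/bin/env python3
# Minimal sketch: take MemAvailable from /proc/meminfo and keep some
# headroom for cache and buffers. The 20% reserve is an arbitrary
# placeholder, not a recommendation.

def mem_available_bytes(path="/proc/meminfo"):
    with open(path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024  # value is in kB
    raise RuntimeError("MemAvailable not found in /proc/meminfo")

HEADROOM = 0.20  # fraction kept free for cache/buffers (placeholder)
budget = int(mem_available_bytes() * (1 - HEADROOM))
print(f"memory budget for build workers: {budget / 2**30:.1f} GiB")
```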

(or else, what's wrong with using /proc/meminfo directly?)

I haven't looked at how packages currently try to compute potential parallelism from /proc/meminfo data, but my own experience with Java stuff, where otherwise perfectly competent, highly qualified engineers get the available-RAM computation wrong, doesn't make me too optimistic about the overall accuracy of these guesses.

E.g. a few hours ago:
I fear your rebuild is ooming workers (...) it seems that some package is reducing its parallelism to two c++ compilers and that still exceeds 20G

Providing a simple tool that standardizes the calculation, along with documented examples and guidelines, is certainly going to help here. It would also centralize the logic that collects, parses and computes the result in a single place, reducing duplication and maintenance burden across packages.

You'd need to somehow get people to define policies, what would that look like?

I would suggest making it possible to input the overall marginal RAM requirement per parallelized process, that is, the amount of additional "available RAM" needed for every additional process. As that value is very probably going to be larger for the first few processes, and as this matters more in constrained environments (e.g. containers, busy CI runners, etc.), making it possible to define a curve of sorts (e.g. 8 GiB - 5 GiB - 2 GiB - 2 GiB ... => 7 workers with 23 GiB of available RAM) would allow a closer match to the constraints of these environments.
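A minimal sketch of that curve-based calculation, using the hypothetical numbers from the example above (the last value of the curve repeats for every further process):

```python
def workers_for_budget(curve_gib, budget_gib):
    """How many processes fit in budget_gib, charging each additional
    process the next value from curve_gib; past the end of the curve,
    the last value repeats."""
    spent = 0.0
    workers = 0
    while True:
        cost = curve_gib[min(workers, len(curve_gib) - 1)]
        if cost <= 0:
            raise ValueError("per-process cost must be positive")
        if spent + cost > budget_gib:
            return workers
        spent += cost
        workers += 1

# Example from the text: 8 GiB - 5 GiB - 2 GiB - 2 GiB ... with 23 GiB available
print(workers_for_budget([8, 5, 2], 23))  # -> 7
```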

In addition, it would be nice to provide an option to cap the computed result at the number of actual physical CPU cores (not vcpus/threads), and another one to set an arbitrary upper limit on the number of processes beyond which no gains are expected.
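And a rough sketch of those two caps on top of it; counting physical cores by pairing "physical id" and "core id" from /proc/cpuinfo is Linux-specific and not available on every architecture, and the hard cap of 16 below is purely illustrative:

```python
import os

def physical_core_count(path="/proc/cpuinfo"):
    """Count distinct (physical id, core id) pairs from /proc/cpuinfo.
    Where those fields are missing, fall back to os.cpu_count(),
    which counts threads rather than cores."""
    cores = set()
    phys = core = None
    try:
        with open(path) as f:
            for line in f:
                if ":" in line:
                    key, value = (part.strip() for part in line.split(":", 1))
                    if key == "physical id":
                        phys = value
                    elif key == "core id":
                        core = value
                elif phys is not None and core is not None:
                    cores.add((phys, core))  # blank line ends a processor block
                    phys = core = None
        if phys is not None and core is not None:
            cores.add((phys, core))
    except OSError:
        pass
    return len(cores) or os.cpu_count() or 1

memory_workers = 7   # e.g. the result of the curve calculation above
HARD_CAP = 16        # arbitrary upper limit, for illustration only

print(min(memory_workers, physical_core_count(), HARD_CAP))
```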

Cheers,

--
Julien Plissonneau Duquène
