On Wed, 5 Jan 2022 at 21:21, Sam James <s...@gentoo.org> wrote:
>
>> On 5 Jan 2022, at 19:18, Kai Krakow <k...@kaishome.de> wrote:
>
>>> On Wed, 5 Jan 2022 at 19:22, Ulrich Mueller <u...@gentoo.org> wrote:
>
> [...]
>
>>> That applies to all parallel builds though, not only to ebuilds
>>> inheriting check-reqs.eclass. By tweaking MAKEOPTS, we're basically
>>> telling the user that the --jobs setting in their make.conf is wrong,
>>> in the first place.
>
>
>> Well, I'm using a safe combination of jobs and load-average, maybe the
>> documentation should be tweaked instead.
>
>
> I think "safe" is doing some heavy lifting here...

Well, it has been "safe" for me at least, but you're right.

>> I'm using
>> [...]
>
>
>> The "--jobs" parameter is mostly a safe-guard against "make" or
>> "emerge" overshooting the system resources which would happen if
>> running unconstrained without "--load-average". The latter parameter
>> OTOH tunes the parallel building processes automatically to the
>> available resources. If the system starves of memory, thus starts to
>> swap, load will increase, and make will reduce the jobs. It works
>> pretty well.
>
>> I've chosen the emerge loadavg limit slightly higher so a heavy ebuild
>> won't starve emerge from running configure phases of parallel ebuilds.
>
>
> ... because it's quite hard for this logic to work correctly enough
> of the time without jobserver integration (https://bugs.gentoo.org/692576).

Oh, there's a bug report about this... I had already wondered whether
it wouldn't be better to have a global jobserver. OTOH, there are so
many build systems out there which parallelize building, and many of
them won't use a make jobserver but roll their own solution, so that
route looks a bit futile. That's why I've chosen the loadavg-based
approach.
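
For illustration, a load-average-based setup in make.conf could look
roughly like this (the numbers are made up for a 16-thread machine,
they are not the config elided above, and they would need per-system
tuning):

    # make.conf -- illustrative sketch only
    MAKEOPTS="--jobs=16 --load-average=16"
    # emerge's load limit sits slightly higher so configure phases of
    # parallel ebuilds still get scheduled next to one heavy build
    EMERGE_DEFAULT_OPTS="--jobs=4 --load-average=18"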

> But indeed, I'd say you're not the target audience for this (but I appreciate
> the input).

Maybe not. I'm usually building in tmpfs (except for huge source
archives with huge build artifacts), which means I usually have plenty
of RAM, at least enough that it doesn't become the limiting factor.

But then again, what is the target audience? This proposal looks like
it tries to predict the future, and that's probably never going to
work right. Looking at the GitHub issue linked at the start of the
thread, it looks like I /might/ be the target audience for packages
like qtwebkit because I'm building in tmpfs. The loadavg limiter does
quite well here unless a second huge ebuild gets unpacked and built in
the tmpfs, at which point the system struggles to keep up, starves
from I/O thrashing, and then portage gets OOM-killed a few moments
later. That's of course not due to the build jobs themselves; it's
purely a memory limitation. For that reason I have configuration to
build such packages outside of tmpfs: while such a package usually
builds fine on its own, it fails the very moment two of them are built
in parallel.
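
As a sketch (the paths and package atoms are just examples, not my
exact setup), per-package overrides via package.env can redirect such
builds from tmpfs to disk:

    # /etc/portage/env/no-tmpfs.conf
    PORTAGE_TMPDIR="/var/tmp/portage-disk"

    # /etc/portage/package.env
    www-client/chromium    no-tmpfs.conf
    dev-qt/qtwebengine     no-tmpfs.conf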

Maybe portage needs a job server that dynamically bumps the job
counter up or down based on current memory usage? Or "make" itself
could be patched to take that into account? But that's probably the
whole idea behind the loadavg limiter already. So I'd propose to at
least mention it in the documentation and examples; it seems to be
little known.

Then again, if we run on a memory-constrained system, it may be better
to parallelize ebuilds instead of build jobs, so that light and heavy
ebuild phases overlap within the same time period.

Also, I'm not sure if 2 GB per job is the full picture - regardless of
whether that number is correct or not... Usually the link phase of
packages like Chrome is the real RAM burner even with sane "jobs"
parameters. I've seen people fail to install these packages because
they didn't turn on swap, and then during the link phase that single
process took so much memory that it either froze the system for half
an hour or got OOM-killed. At that stage, there's usually just that
one process running (plus maybe some small ones using almost no memory
relative to it). And that doesn't get better with modern compilers
doing all sorts of global optimization like LTO.

So maybe something like this could work (excluding the link phase):

If only one ebuild can be running at a time (i.e. your merge list has
just one package), the effect of MAKEOPTS is quite predictable. But if
we potentially run more, we could carefully reduce the number of jobs
in MAKEOPTS before applying additional RAM heuristics. And those
heuristics should probably take the combination of emerge jobs and
make jobs into account, because the two potentially multiply (unless
692576 is implemented).
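
A very rough shell sketch of that heuristic (the 2 GiB per job and the
emerge job count are assumptions of mine, not numbers from the
proposal):

    # assumed memory budget per compiler job, in GiB
    per_job_gib=2
    # assumed number of parallel emerge jobs (from EMERGE_DEFAULT_OPTS)
    emerge_jobs=4
    # currently available memory in GiB, from /proc/meminfo
    mem_gib=$(( $(awk '/MemAvailable/ {print $2}' /proc/meminfo) / 1024 / 1024 ))
    # split the memory budget across emerge jobs before deriving make jobs
    make_jobs=$(( mem_gib / per_job_gib / emerge_jobs ))
    # never exceed the CPU count, never drop below one job
    [ "$make_jobs" -gt "$(nproc)" ] && make_jobs=$(nproc)
    [ "$make_jobs" -lt 1 ] && make_jobs=1
    MAKEOPTS="--jobs=${make_jobs}"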

Compiler and linker flags may also need to be taken into account.

And maybe portage should optionally take care of serializing huge
packages so that they are never built/unpacked at the same time. This
would be a big win for me because I would not have to configure things
manually... Something like PORTAGE_SERIALIZE_CONSTRAINED="1" to build
at most one package at a time that carries some RAM/storage warning
vars in its ebuild. But that's probably a different topic, as it
doesn't exactly target the problem discussed here - and unlike the
target audience, I'm also aware of this problem.


Regards,
Kai
