On May 23, 2025, at 12:13, Dennis Clarke <dcla...@blastwave.org> wrote:
> On 5/23/25 15:00, Mark Millard wrote:
>> Dennis Clarke <dclarke_at_blastwave.org> wrote on
>> Date: Fri, 23 May 2025 17:45:17 UTC :
>>> I have been watching qt6-webengine-6.8.3 fail over and over and over
>>> for some days now and it takes with it a pile of other stuff.
>>>
>>> In the log I see this unscripted trash of a message :
>>>
>>> [00:05:03] FAILED: v8_context_snapshot.bin
>>> [00:05:03] /usr/local/bin/python3.11
>>> ../../../../../qtwebengine-everywhere-src-6.8.3/src/3rdparty/chromium/build/gn_run_binary.py
>>> ./v8_context_snapshot_generator --output_file=v8_context_snapshot.bin
>>> [00:05:03]
>>> [00:05:03]
>>> [00:05:03] #
>>> [00:05:03] # Fatal error in , line 0
>>> [00:05:03] # Oilpan: Out of memory
>>> [00:05:03] #
>>> [00:05:03] #
>>
>> Way too little context so all I can do is basically form
>> questions at this point.
>
> Sorry ... I just realized that other people replied to me OFF-LIST and
> that is not helpful to others.
>
> So the machine titan is fairly beefy :
>
> titan#
> titan# uname -apKU
> FreeBSD titan 15.0-CURRENT FreeBSD 15.0-CURRENT #1 main-n277353-19419d36cf2a:
> Mon May 19 20:40:28 UTC 2025
> root@titan:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 amd64 1500043
> 1500043
> titan#
> titan# sysctl hw.model
> hw.model: Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz
> titan#
> titan# sysctl hw.ncpu
> hw.ncpu: 64
> titan#
> titan# sysctl hw.physmem
> hw.physmem: 549598998528
> titan#
> titan# sysctl hw.freemem
> sysctl: unknown oid 'hw.freemem'
> titan#
> titan# sysctl kstat.zfs.misc.arcstats.memory_free_bytes
> kstat.zfs.misc.arcstats.memory_free_bytes: 404796436480
> titan# sysctl vm.kmem_map_free
> vm.kmem_map_free: 431405096960
> titan#
>
> Also plenty of storage and local NVMe stuff etc etc and dual
> NVidia GPU's that do nothing at all. For now.
>
> We ( myself and others ) have already found that the problem was
> me. No big surprise.
>
> USE_TMPFS=yes
> TMPFS_LIMIT=32
> MAX_MEMORY=32
> # MAX_FILES=1024
> MAX_EXECUTION_TIME=172800
> PARALLEL_JOBS=64
> PREPARE_PARALLEL_JOBS=64
>
> That was the problem in the poudriere config.
>
> I commented out the MAX_MEMORY and TMPFS_LIMIT and then watched
> as www/qt6-webengine built just fine. Guess the jail needed more
> than 32G eh?
>
>> I assume that you have not explicitly restricted the memory
>> space for any processes, so that RAM+SWAP is fully available
>> to everything. If not, you need to report on the details.
>>
>
> Yup .. I had restrictions in place. Those very very few packages
> are hogs. Just massive running pigs for memory it seems.
>
>> How much RAM? How much SWAP space? (So: how much RAM+SWAP?)
>> (RAM+SWAP does not vary per process tree or per builder,
>> presuming no deliberate restrictions have been placed.)
>
> 512G mem and 32G swap which never gets touched.

So RAM+SWAP == 544 GiBytes

544 GiBytes / 64 jobs == 8.5 GiBytes/job [mean]

Some builders will exceed that, for sure. But you are really managing
the total across the 64 jobs (adjusting the jobs count would be a
possibility).
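(As a rough sketch only, not something from my actual scripts: the mean
can be recomputed on the host with the base sysctl(8) and swapinfo(8)
tools. The "64" is just the PARALLEL_JOBS value, and the shell
arithmetic is integer-only.)

# Sketch: mean RAM+SWAP per builder, in MiBytes.
# hw.physmem is in bytes; swapinfo -k reports 1K blocks and its last
# line is the total (or the only swap device).
jobs=64
ram=$(sysctl -n hw.physmem)
swap=$(( $(swapinfo -k | awk 'END { print $2 }') * 1024 ))
echo "$(( (ram + swap) / jobs / 1024 / 1024 )) MiBytes/job [mean]"

With the figures above that comes out to roughly 8700 MiBytes/job,
i.e. about the 8.5 GiBytes/job mean.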
I do things like:

For 32 GiBytes of RAM: RAM+SWAP == 150 GiBytes
(so: RAM+SWAP == 4.6875*RAM)

So for 8 hw threads (and, so, 8 jobs):
150 GiBytes / 8 jobs == 18.75 GiBytes/job [mean]

For 64 GiBytes of RAM: RAM+SWAP == 300 GiBytes
(so: RAM+SWAP == 4.6875*RAM)

So for 12 performance hw threads (and, so, 12 jobs):
300 GiBytes / 12 jobs == 25 GiBytes/job [mean]

But for 16 hw threads (and, so, 16 jobs):
300 GiBytes / 16 jobs == 18.75 GiBytes/job [mean]

For 192 GiBytes of RAM: RAM+SWAP == 704 GiBytes
(so: RAM+SWAP == 3.66...*RAM, I had other reasons to constrain this
context more)

So for 32 hw threads (and, so, 32 jobs):
704 GiBytes / 32 jobs == 22 GiBytes/job [mean]

But I also use TMPFS_BLACKLIST extensively with USE_TMPFS=all to avoid
having much probability of running out of resources. (This actually
uses less than all, but it does not eliminate tmpfs use from being
involved for the blacklisted package builds.)

This is for use of ALLOW_MAKE_JOBS=yes (so a high load average context
much of the time). I do end up with examples of SWAP being used for
paging.

In a couple of those hw contexts, I, on very rare occasion, do a
'bulk -c -a' run (as a test), also with ALLOW_MAKE_JOBS=yes. (I
generally do not use MAKE_JOBS_NUMBER_LIMIT or the like.) A bias is to
keep the hw threads busy with already-available useful work to do
without having to systematically wait for the next unit of work. (Of
course, other tradeoffs are also managed.)

Note: My original questions should have asked about ALLOW_MAKE_JOBS and
the likes of MAKE_JOBS_NUMBER_LIMIT use as well. (Not that such matters
now.)

>> Do you even have "whatever it seems to want" configured
>> for the RAM+SWAP? (I'm guessing that you do not know that
>> the "128G" figure is in fact involved.)
>>
>
> I commented out those restrictions. Makes me worry that some other
> packages will come along and fail because they need 384G of mem or
> something silly like that.

I'd be more worried about multiple jobs that each use more than
8.5 GiBytes and happen to push the total past (RAM+SWAP)
[544 GiBytes].

Side note: at one point in a past 'bulk -c -a' run test, a process hit
a runaway error that rapidly wrote to the log file without bound. Not
good if the log file is in tmpfs. But such has happened only with one
vintage of the ports tree. It is the only context I've seen that would
reach or pass 384 GiBytes for one builder.

> I have been advised ( in the last hour )
> that chromium ports generate +40000 source files and such. That is
> just abusive but the way of the future I am sure.

rust with USE_TMPFS=all and not in TMPFS_BLACKLIST can use 27+ GiBytes
of RAM+SWAP. And it is not the largest example around.

>> How many parallel builders are active in the bulk run
>> as the bulk build approaches the failure?
>
> I think 64 max.
>
>> How much RAM+SWAP in use by other builders or other things
>> on the system as the system progresses to that failure (not
>> after the failure)?
>
> ....
>
> *sigh*
>
> The problem was me.
>
>> ZFS (and its ARC)? UFS? If ZFS: Any tuning?
>>
>
> No tuning. It just works(tm) and that is ZFS.

Most of my use is UFS based. Only the machine with the most hw threads
and RAM is ZFS based these days. (Part of that is to serve as a UFS
test case, since UFS use for such activities is less common: At one
time, I noticed and reported that 'bulk -c -a' was at the time broken
under UFS because of reaching a UFS limitation that was later
adjusted.)
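To illustrate the TMPFS_BLACKLIST style mentioned above, a rough sketch
of the relevant poudriere.conf lines follows (the globs and the path
here are made-up examples, not my actual configuration):

# Sketch only: builders use tmpfs in general ...
USE_TMPFS=all
# ... but known memory hogs are kept out of tmpfs (package name globs;
# example entries only):
TMPFS_BLACKLIST="rust* llvm* chromium* qt6-webengine*"
# Blacklisted builds still need a disk-backed work area (example path):
TMPFS_BLACKLIST_TMPDIR=${BASEFS}/wrkdirs
# And, as noted, I build with:
ALLOW_MAKE_JOBS=yes

Unlike MAX_MEMORY or TMPFS_LIMIT, this does not cap what a build may
use; it just keeps the biggest WRKDIRs from competing with the build
processes for RAM+SWAP.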
>> Basically: all the significant sources of competing for
>> RAM+SWAP?
>> ....
>>
>> ===
>> Mark Millard
>> marklmi at yahoo.com
>
>
> It feels like the correct approach is just to give everything to the
> poudriere bulk situation and then watch for flames.
>
> No flames? No smoke? Great .. it is working.

I did experimentation with 'bulk -c -a' to arrive at configuration
choices that would survive such and left sizable margins for handling
growth without regular adjustments. So my flame tests were more up
front.

===
Mark Millard
marklmi at yahoo.com