On Fri, 07 Mar 2025 12:59:49 UTC, Matthias Apitz <guru_at_unixarea.de> wrote:

> I'm building ports on a recent 15.0-CURRENT with ports tree from git on March 3:
> 
> # uname -a
> FreeBSD jet 15.0-CURRENT FreeBSD 15.0-CURRENT #0 main-n275738-7ee310c80ea7: Sun Mar 2 01:13:00 CET 2025 guru@jet:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
> 
> # sysctl vfs.read_max=128
> # sysctl vfs.aio.max_buf_aio=8192
> # sysctl vfs.aio.max_aio_queue_per_proc=65536
> # sysctl vfs.aio.max_aio_per_proc=8192
> # sysctl vfs.aio.max_aio_queue=65536
> # sysctl vm.pageout_oom_seq=120
> # sysctl vm.pfault_oom_attempts=-1 
> 
> # poudriere bulk -f /usr/local/etc/poudriere-list -J 3 -j 150-CURRENT -p ports20250303
> 
> The file /usr/local/etc/poudriere-list contains only www/chromium, because
> all other ~2400 pkg are already made.
> 
> The job gets killed with:
> ...
> ===> chromium-133.0.6943.141_1 depends on package: py311-ply>0 - found
> ===> Returning to build of chromium-133.0.6943.141_1
> ===> chromium-133.0.6943.141_1 depends on executable: bindgen - not found
> ===> Installing existing package /packages/All/rust-bindgen-cli-0.71.1_2.pkg
> [150-CURRENT-ports20250303-job-01] Installing rust-bindgen-cli-0.71.1_2...
> [150-CURRENT-ports20250303-job-01] `-- Installing llvm19-19.1.7_1...
> [150-CURRENT-ports20250303-job-01] | `-- Installing libedit-3.1.20240808,1...
> [150-CURRENT-ports20250303-job-01] | `-- Extracting libedit-3.1.20240808,1: .......... done
> [150-CURRENT-ports20250303-job-01] | `-- Installing lua53-5.3.6_1...
> [150-CURRENT-ports20250303-job-01] | `-- Extracting lua53-5.3.6_1: .......... done
> [150-CURRENT-ports20250303-job-01] | `-- Installing perl5-5.36.3_2...
> [150-CURRENT-ports20250303-job-01] | `-- Extracting perl5-5.36.3_2: .......... done
> [150-CURRENT-ports20250303-job-01] | `-- Installing zstd-1.5.7...
> [150-CURRENT-ports20250303-job-01] | | `-- Installing liblz4-1.10.0,1...
> [150-CURRENT-ports20250303-job-01] | | `-- Extracting liblz4-1.10.0,1: .......... done
> [150-CURRENT-ports20250303-job-01] | `-- Extracting zstd-1.5.7: .......... done
> [150-CURRENT-ports20250303-job-01] `-- Extracting llvm19-19.1.7_1: .....Killed
> Child process pid=66963 terminated abnormally: Killed

Note that the above was killed while extracting from llvm19-19.1.7_1.pkg.

What are you using in /usr/local/etc/poudriere.conf for
USE_TMPFS= and for the likes of TMPFS_BLACKLIST= and
TMPFS_BLACKLIST_TMPDIR= ?

I'll note that the file system space use grows vastly
larger in the later stages of the chromium build. Backing
that with tmpfs would have the file system content
competing for RAM as well.
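
For reference, an illustrative sketch of poudriere.conf
settings that keep the biggest builds off tmpfs (the glob
list and the path here are hypothetical examples, not
specific recommendations):

USE_TMPFS="wrkdir data"
TMPFS_BLACKLIST="rust* llvm* chromium*"
TMPFS_BLACKLIST_TMPDIR=${BASEFS}/wrkdirs

With such settings, WRKDIR for the blacklisted packages
lands on normal file system space under
TMPFS_BLACKLIST_TMPDIR instead of in RAM-backed tmpfs.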

How many FreeBSD CPUs does the box have? In
/usr/local/etc/poudriere.conf what are you using for
ALLOW_MAKE_JOBS= ? For ALLOW_MAKE_JOBS_PACKAGES= ?
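
Likewise just an illustrative sketch (the package globs
are hypothetical):

ALLOW_MAKE_JOBS_PACKAGES="pkg llvm* rust* chromium"

ALLOW_MAKE_JOBS=yes allows parallel make jobs for all
builds; ALLOW_MAKE_JOBS_PACKAGES always allows them for
the listed package globs, even when ALLOW_MAKE_JOBS is
not set.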

With -J3 in use, later in a bulk run there can also be
questions of what the RAM+SWAP usage was like for the
other jobs running in parallel at the time.
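
One way to get a feel for that while a bulk run is going,
using just base system tools (illustrative, not the only
way):

# top -o res
# vmstat 5

top -o res sorts the process display by resident memory;
vmstat 5 reports free memory and paging activity every 5
seconds.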

> # grep 66963 /var/log/messages
> Mar 7 10:24:34 jet kernel: pid 66963 (pkg-static), jid 3, uid 0, was killed: failed to reclaim memory

The above message means that FreeBSD had a sustained
period of being unable to meet its Free Memory
threshold. Even a single process (thread) that
both stays runnable and causes a sufficiently large
Active Memory figure can lead to this condition.

There is a tunable for giving the system more time to
reclaim memory before that happens. For example, in
/boot/loader.conf:

vm.pageout_oom_seq=120

is 10 times the default of 12.

# sysctl -Td vm.pageout_oom_seq
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM

It is also writable on a live system:

# sysctl -Wd vm.pageout_oom_seq
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM

So:

# sysctl vm.pageout_oom_seq=120
vm.pageout_oom_seq: 120 -> 120

(I use 120 in a wide variety of contexts.)
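
Since the tunable is writable on a live system, the
setting can also be made persistent via /etc/sysctl.conf
instead of (or in addition to) /boot/loader.conf:

vm.pageout_oom_seq=120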

> # swapctl -l
> Device:          1024-blocks     Used:
> /dev/da0p3           4194304      3144

The above added 4 GiBytes of SWAP that (mostly)
do not compete for RAM. So RAM+SWAP is effectively
about 16 GiByte + 4 GiByte == 20 GiByte.
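
For cross-checking such figures, standard commands such
as:

# sysctl -n hw.physmem
# swapinfo -h

report the RAM size in bytes and the configured swap
devices with their totals, respectively.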

It is not clear from the below what type of
/dev/md* is in use in each case: malloc?
vnode? swap? (It had better not be null.)
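
One way to check, presuming the devices were created via
mdconfig:

# mdconfig -lv

That lists each md unit along with its type and size,
and, for vnode-backed ones, the backing file path.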

Presuming vnode: quoting 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206048#c7

QUOTE
On 2017-Feb-13, at 7:20 PM, Konstantin Belousov <kostikbel at gmail.com> wrote
on the freebsd-arm list:

. . .

swapfile write requires the write request to come through the filesystem
write path, which might require the filesystem to allocate more memory
and read some data. E.g. it is known that any ZFS write request
allocates memory, and that write request on large UFS file might require
allocating and reading an indirect block buffer to find the block number
of the written block, if the indirect block was not yet read.

As result, swapfile swapping is more prone to the trivial and unavoidable
deadlocks where the pagedaemon thread, which produces free memory, needs
more free memory to make a progress. Swap write on the raw partition over
simple partitioning scheme directly over HBA are usually safe, while e.g.
zfs over geli over umass is the worst construction.
END QUOTE

So I'd not recommend the use of vnode-backed /dev/md*
swap spaces.

Nor would I recommend any of the other types (malloc,
swap, or null) for creating /dev/md* swap space.
"man 8 mdconfig" reports:

QUOTE
     -t type
             Select the type of the memory disk.

             malloc     Storage for this type of memory disk is allocated
                        with malloc(9).  This limits the size to the malloc
                        bucket limit in the kernel.  If the -o reserve
                        option is not set, creating and filling a large
                        malloc-backed memory disk is a very easy way to
                        panic the system.

             vnode      A file specified with -f file becomes the backing
                        store for this memory disk.

             swap       Storage for this type of memory disk is allocated
                        from buffer memory.  Pages get pushed out to swap
                        when the system is under memory pressure, otherwise
                        they stay in the operating memory.  Using swap
                        backing is generally preferred instead of using
                        malloc backing.

             null       Bitsink; all writes do nothing, all reads return
                        zeroes.
END QUOTE
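
If more swap space does prove to be needed, a dedicated
raw partition is the safer route, per the earlier quote.
A hypothetical /etc/fstab entry (the device name is
illustrative, not from your system):

/dev/da0p4  none  swap  sw  0  0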

> /dev/md9            10485760      3060
> /dev/md10           10485760      3448
> /dev/md11           10485760      2924
>  
> I checked all swap devices with (example):
> 
> # dd if=/dev/md10 of=/dev/null bs=8m
> 
> all "devices" are fine.
> 
> The box has 16 GByte RAM.
> 
> What else could I check/do?


===
Mark Millard
marklmi at yahoo.com

