On 2022-May-10, at 11:49, Mark Millard <mark...@yahoo.com> wrote:

> On 2022-May-10, at 08:47, Jan Mikkelsen <j...@transactionware.com> wrote:
>> On 10 May 2022, at 10:01, Mark Millard <mark...@yahoo.com> wrote:
>>> On 2022-Apr-29, at 13:57, Mark Millard <mark...@yahoo.com> wrote:
>>>> On 2022-Apr-29, at 13:41, Pete Wright <p...@nomadlogic.org> wrote:
>>>>>> . . .
>>>>> d'oh - went out for lunch and workstation locked up.  i *knew* i 
>>>>> shouldn't have said anything lol.
>>>> Any interesting console messages ( or dmesg -a or /var/log/messages )?
>>> I've been doing some testing of a patch by tijl at FreeBSD.org
>>> and have reproduced both hang-ups (ZFS/ARC context) and kills
>>> (UFS/noARC and ZFS/ARC) for "was killed: failed to reclaim
>>> memory", both with and without the patch. This is with only a
>>> tiny fraction of the swap partition(s) enabled being put to
>>> use. So far, the testing was deliberately with
>>> vm.pageout_oom_seq=12 (the default value). My testing has been
>>> with main [so: 14].
>>> But I also learned how to avoid the hang-ups that I got --but
>>> it costs making kills more likely/quicker, other things being
>>> equal.
>>> I discovered that the hang-ups that I got were from all the
>>> processes through which I interact with the system ending up
>>> with their kernel threads swapped out and not being swapped
>>> back in (including sshd, so no new ssh connections). In
>>> some contexts I only had escaping into the kernel debugger
>>> available, not even ^T would work. Other times ^T did work.
>>> So, when I'm willing to risk kills in order to maintain
>>> the ability to interact normally, I now use in
>>> /etc/sysctl.conf :
>>> vm.swap_enabled=0
>> I have been looking at an OOM related issue. Ignoring the actual leak, the 
>> problem leads to a process being killed because the system was out of 
>> memory. This is fine. After that, however, the system console was black with 
>> a single block cursor and the console keyboard was unresponsive. Caps lock 
>> and num lock didn’t toggle their lights when pressed.
>> Using an ssh session, the system looked fine. USB events for the keyboard 
>> being disconnected and reconnected appeared but the keyboard stayed 
>> unresponsive.
>> Setting vm.swap_enabled=0, as you did above, resolved this problem. After 
>> the process was killed a perfectly normal console returned.
>> The interesting thing is that this test system is configured with no swap 
>> space.
>> This is on 13.1-RC5.
>>> This disables swapping out of process kernel stacks. It
>>> is just that, with that option for gaining free RAM removed,
>>> there are fewer options tried before a kill is initiated. It is not a
>>> loader-time tunable but is writable, thus the
>>> /etc/sysctl.conf placement.
>> Is that really what it does? From a quick look at the code in 
>> vm/vm_swapout.c, it seems a little more complex.
> I was going by its description:
> # sysctl -d vm.swap_enabled
> vm.swap_enabled: Enable entire process swapout
> Based on the below, it appears that the description
> presumes vm.swap_idle_enabled==0 (the default). In
> my context vm.swap_idle_enabled==0 . Looks like I
> should also list:
> vm.swap_idle_enabled=0
> in my /etc/sysctl.conf with a reminder comment that the
> pair of =0's are required for avoiding the observed
> hang-ups.
> The analysis goes like . . .
> I see in the code that vm.swap_enabled !=0 causes
> void
> vm_swapout_run(void)
> {
>        if (vm_swap_enabled)
>                vm_req_vmdaemon(VM_SWAP_NORMAL);
> }
> and that in turn leads vm_daemon to:
>                if (swapout_flags != 0) {
>                        /*
>                         * Drain the per-CPU page queue batches as a deadlock
>                         * avoidance measure.
>                         */
>                        if ((swapout_flags & VM_SWAP_NORMAL) != 0)
>                                vm_page_pqbatch_drain();
>                        swapout_procs(swapout_flags);
>                }
> Note: vm.swap_idle_enabled==0 && vm.swap_enabled==0 ends
> up with swapout_flags==0. vm.swap_idle. . . defaults seem
> to be (in my context):
> # sysctl -a | grep swap_idle
> vm.swap_idle_threshold2: 10
> vm.swap_idle_threshold1: 2
> vm.swap_idle_enabled: 0
> For reference:
> /*
> * Idle process swapout -- run once per second when pagedaemons are
> * reclaiming pages.
> */
> void
> vm_swapout_run_idle(void)
> {
>        static long lsec;
>        if (!vm_swap_idle_enabled || time_second == lsec)
>                return;
>        vm_req_vmdaemon(VM_SWAP_IDLE);
>        lsec = time_second;
> }
> [So vm.swap_idle_enabled==0 avoids VM_SWAP_IDLE status.]
> static void
> vm_req_vmdaemon(int req)
> {
>        static int lastrun = 0;
>        mtx_lock(&vm_daemon_mtx);
>        vm_pageout_req_swapout |= req;
>        if ((ticks > (lastrun + hz)) || (ticks < lastrun)) {
>                wakeup(&vm_daemon_needed);
>                lastrun = ticks;
>        }
>        mtx_unlock(&vm_daemon_mtx);
> }
> [So VM_SWAP_IDLE and VM_SWAP_NORMAL are independent bits
> in vm_pageout_req_swapout.]
> vm_daemon does:
>                mtx_lock(&vm_daemon_mtx);
>                msleep(&vm_daemon_needed, &vm_daemon_mtx, PPAUSE, "psleep",
>                    vm_daemon_timeout);
>                swapout_flags = vm_pageout_req_swapout;
>                vm_pageout_req_swapout = 0;
>                mtx_unlock(&vm_daemon_mtx);
> So vm_pageout_req_swapout is regenerated after that
> each time.
> I'll not show the code for vm.swap_idle_enabled!=0 .
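
The flag handling quoted above can be modelled in user space. What
follows is only a sketch under stated assumptions, not kernel code:
the MODEL_ and model_ names are hypothetical, the two bit values are
assumed for illustration, and locking, the wakeup throttling, and the
once-per-second limit on idle requests are left out. It shows only
that the two settings contribute independent bits and that both
being 0 leaves the daemon with swapout_flags==0.

```c
/* Assumed bit values for illustration; check vm/vm_pageout.h for the real ones. */
#define MODEL_VM_SWAP_NORMAL 1
#define MODEL_VM_SWAP_IDLE   2

/* Stand-ins for vm.swap_enabled and vm.swap_idle_enabled. */
static int model_swap_enabled;
static int model_swap_idle_enabled;

/* Stand-in for vm_pageout_req_swapout: requests accumulated between daemon passes. */
static int model_req_swapout;

/* Like vm_req_vmdaemon(), minus the mutex and the throttled wakeup. */
static void
model_req_vmdaemon(int req)
{
	model_req_swapout |= req;
}

/* Like vm_swapout_run(): request normal swapout only when enabled. */
static void
model_swapout_run(void)
{
	if (model_swap_enabled)
		model_req_vmdaemon(MODEL_VM_SWAP_NORMAL);
}

/* Like vm_swapout_run_idle(), minus the once-per-second check. */
static void
model_swapout_run_idle(void)
{
	if (!model_swap_idle_enabled)
		return;
	model_req_vmdaemon(MODEL_VM_SWAP_IDLE);
}

/* One daemon pass: take the accumulated requests and clear them. */
static int
model_daemon_pass(void)
{
	int swapout_flags = model_req_swapout;

	model_req_swapout = 0;
	return (swapout_flags);
}

/*
 * Hypothetical helper: the flags one daemon pass would see after both
 * request paths have run once with the given settings.
 */
int
model_swapout_flags(int swap_enabled, int swap_idle_enabled)
{
	model_swap_enabled = swap_enabled;
	model_swap_idle_enabled = swap_idle_enabled;
	model_req_swapout = 0;
	model_swapout_run();
	model_swapout_run_idle();
	return (model_daemon_pass());
}
```

In this model, model_swapout_flags(0, 0) returns 0, so the
swapout_procs() call would be skipped, while setting either argument
to 1 sets only the corresponding bit.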

Well, with continued experiments I got an example of
a hangup for which looking via the db> prompt did not
show any swapping out of process kernel stacks
( vm.swap_enabled=0 was the context, so expected ).
The environment was ZFS (so with ARC).

But this was testing with vm.pageout_oom_seq=120 instead
of the default vm.pageout_oom_seq=12 . It may be that,
left to sit long enough, things would have unhung (external

It is part of what I'm experimenting with so we will see.

Mark Millard
marklmi at yahoo.com
