Re: swap space issues

2020-07-12 Thread Don Wilde



On 7/11/20 11:28 PM, Scott Bennett via freebsd-stable wrote:

>   I have read this entire thread to date with growing dismay, and I
> thank Donald Wilde for reporting his ongoing troubles, although they
> spoil my hopes that the kernel's memory management bugs that first became
> apparent in 11.2-RELEASE (and -STABLE around the same time) were not
> propagated into 12.x.  A recent update to stable/12 source tree made it
> finally possible for me to build 12.1-STABLE under 11.4-PRERELEASE, and I
> was just about to install the upgrade when this thread appeared.
Spoiler alert. Since I gave up on Synth, I haven't had a single swap 
issue. It does appear to be one particular port that drove it nuts 
(apparently, one of the 'Google performance' bits, with a 
mismatched-brackets problem). I have rebuilt the machine several times, 
but that's more for my sense of tidiness than anything.


I've got a little Crystal script that walks the installed packages and 
ports and updates them with system() calls.

The machine is very slow, but it's not swapping at all.

It is quite usable now with 12-STABLE.


>   On Fri, 26 Jun 2020 03:55:04 -0700 : Donald Wilde
> wrote:
>
>> On 6/26/20, Peter Jeremy  wrote:
>>>
[snip]
>>> I strongly suggest you don't have more than one swap device on spinning
>>> rust - the VM system will stripe I/O across the available devices and
>>> that will give particularly poor results when it has to seek between the
>>> partitions.
>   True.  The only reason I can think of to use more than one swapping/
> paging area on the same device for the same OS instance is for emergencies
> or highly unusual, temporary situations in which more space is needed until
> those situations conclude, and even in such situations, if the space can be
> found on another device, it should be placed there.  Interleaving of swap
> space across multiple devices is intended as a performance enhancement
> akin to striping (a.k.a. RAID0), although the virtual memory isn't
> necessarily always actually striped across those devices.  Adding a paging
> area on the same device as an existing one is an abhorrent situation, as
> Peter Jeremy noted, and it should be eliminated via swapoff(8) as soon as
> the extraordinary situation has passed.  N.B. the GENERIC kernel sets a
> limit of four swap devices, although it can be rebuilt with a different
> limit.
That's good data, Scott, thanks! The only reason I got into that 
situation of trying to add another swap device was that it was crashing 
with OO swap messages.

>> My intent is to make this machine function -- getting the bear
>> dancing. How deftly she dances is less important than that she dances
>> at all. My for-real boxen will have real HP and real cores and RAM.
>>
>>> Also, you can't actually use 64GB swap with 4GB RAM.  If you look back
>>> through your boot messages, I expect you'll find messages like:
>>> warning: total configured swap (524288 pages) exceeds maximum recommended
>>> amount (498848 pages).
>>> warning: increase kern.maxswzone or reduce amount of swap.
>   Also true.  Unfortunately, no guidance whatsoever is provided to advise
> system administrators who need more space as to how to increase the relevant
> table sizes and limits.  However, that is a documentation bug, not a code
> bug.
I've got both my kern.max* and CCACHE set up mostly correctly. 
Everything builds and runs well, although I've found that it's helpful 
to only use -j3 while building, not -j4 which would be appropriate for 
my HAMMER i3. I'd much rather have the bear *dancing* than running into 
walls. :D

>> Yes, as I posted, those were part of the failure stream from the synth
>> program. When I had kern.maxswzone increased, it got through boot
>> without complaining.
>>
>>> or maybe:
>>> WARNING: reducing swap size to maximum of MB per unit
>>
>> The warnings were there, in the as-it-failed complaints.


>>> The absolute limit on swap space is vm.swap_maxpages pages but the
>>> realistic limit is about half that.  By default the realistic limit is
>>> about 4×RAM (on 64-bit architectures), but this can be adjusted via
>>> kern.maxswzone (which defines the #bytes of RAM to allocate to swzone
>>> structures - the actual space allocated is vm.swzone).
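
As a rough sketch of that rule of thumb (not taken from the kernel source),
the snippet below multiplies RAM by four and compares the result with the
64GB of swap discussed above. The function name and the factor parameter are
hypothetical, and the real limit depends on kern.maxswzone and the kernel
version in use.

```python
GIB = 1024 ** 3

def rule_of_thumb_swap_limit(ram_bytes, factor=4):
    """Approximate default 'realistic' swap limit: about factor x RAM.

    The factor of 4 is the 64-bit rule of thumb quoted in the thread;
    it is an approximation, not a value read from the running kernel.
    """
    return factor * ram_bytes

ram = 4 * GIB               # the machine under discussion has 4GB of RAM
configured_swap = 64 * GIB  # the swap size that triggered the warnings

limit = rule_of_thumb_swap_limit(ram)
print(f"rule-of-thumb limit: {limit // GIB} GiB")
print(f"configured swap exceeds it: {configured_swap > limit}")
```

With 4GB of RAM the estimate comes to 16 GiB, so 64GB of swap is well past
what the default tables can be expected to track.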

>>> As a further piece of arcana, vm.pageout_oom_seq is a count that controls
>>> the number of passes before the pageout daemon gives up and starts killing
>>> processes when it can't free up enough RAM.  "out of swap space" messages
>>> generally mean that this number is too low, rather than there being a
>>> shortage of swap - particularly if your swap device is rather slow.


>> Thanks, Peter!
>
>   A second round of thanks to Peter Jeremy for pointing out this sysctl
> variable (vm.pageout_oom_seq), although thus far I have yet to see that it is
> actually effective in working around the memory management bugs.  I have added
> the following lines to /etc/sysctl.conf.
>
> # Because FreeBSD 11.{2,3,4} tie up page frames unnecessarily, set value high
> #vm.pageout_wakeup_thresh=14124 # Default value
> vm.pageout_wakeup_thresh=112640 # 410 MB
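
For a sense of scale, here is a minimal sketch that converts those two
thresholds to MiB. It assumes the sysctl value is a page count and that pages
are 4 KiB (as on amd64 and i386); the helper name is hypothetical. Under those
assumptions 112640 pages works out to 440 MiB rather than the 410 MB given in
the comment.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to mebibytes."""
    return pages * page_size / (1024 * 1024)

# The two values from the sysctl.conf lines above.
for label, pages in [("default", 14124), ("raised", 112640)]:
    print(f"{label}: {pages} pages = {pages_to_mib(pages):.1f} MiB")
```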


[snip]

I do totally agree that these are cr

Re: swap space issues

2020-07-12 Thread Jonathan Chen
On Mon, 13 Jul 2020 at 02:24, Don Wilde  wrote:
> On 7/11/20 11:28 PM, Scott Bennett via freebsd-stable wrote:
> >   I have read this entire thread to date with growing dismay, and I
> > thank Donald Wilde for reporting his ongoing troubles, although they
> > spoil my hopes that the kernel's memory management bugs that first became
> > apparent in 11.2-RELEASE (and -STABLE around the same time) were not
> > propagated into 12.x.  A recent update to stable/12 source tree made it
> > finally possible for me to build 12.1-STABLE under 11.4-PRERELEASE, and I
> > was just about to install the upgrade when this thread appeared.
> Spoiler alert. Since I gave up on Synth, I haven't had a single swap
> issue. It does appear to be one particular port that drove it nuts
> (apparently, one of the 'Google performance' bits, with a
> mismatched-brackets problem). I have rebuilt the machine several times,
> but that's more for my sense of tidiness than anything.

With synth you can reduce the number of workers to just "1" (ie:
Number_of_builders=1), if you just want your ports-build to complete
without any stress. However, one of the reasons why I use synth is
_because_ of the stress it can place on my 12-STABLE snapshots. If the
system is stable and performs well when under load, I feel just that
bit more assured about using it in production environments.

My 2 cents.
-- 
Jonathan Chen 
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: swap space issues

2020-07-12 Thread Don Wilde



On 7/12/20 12:39 PM, Jonathan Chen wrote:
[snip]

With synth you can reduce the number of workers to just "1" (ie:
Number_of_builders=1), if you just want your ports-build to complete
without any stress. However, one of the reasons why I use synth is
_because_ of the stress it can place on my 12-STABLE snapshots. If the
system is stable and performs well when under load, I feel just that
bit more assured about using it in production environments.

My 2 cents.
Yeah, I did that. Problem was a bad update to a port, had mismatched 
bracket element so blew the stack.


Same thing happened with one worker, one task. Made sure I didn't use 
that port again... ;-)


--
Don Wilde

* What is the Internet of Things but a system  *
* of systems including humans? *




11.4-RELEASE i386 won't boot

2020-07-12 Thread Greg Balfour
I have an ancient Pentium machine(*) that I've been keeping up to
date using freebsd-update.  It has run everything fine up through
11.3-RELEASE-p11.  However it does not like the 11.4-RELEASE kernel.

  /boot/kernel/kernel text=0x128f22b data=0xe9748+0x2890f4
syms=[0x4+0xea3e0+0x4+0x1797e9]
  /boot/entropy size=0x1000

  Hit [Enter] to boot immediately, or any other key for command prompt.
  Booting [/boot/kernel/kernel] in 9 seconds...

  Type '?' for a list of commands, 'help' for more detailed help.
  OK set boot_verbose
  OK boot
  Booting...
  \
  int=0006  err=  efl=00010002  eip=c0ba6fa2
  eax=0001  ebx=0201ec00  ecx=  edx=c19ef18c
  esi=c19eed34  edi=c19eeaa0  ebp=c201fd08  esp=c19ee704
  cs=0008  ds=0010  es=0010  fs=0010  gs=0010  ss=0010
  cs:eip=0f 45 d1 c1 e0 04 89 56-20 66 89 46 26 a1 d0 2c
 95 c1 89 46 28 5e 5d c3-90 90 90 90 90 90 55 89
  ss:esp=00 00 00 00 00 00 00 00-00 00 00 00 0c e7 9e c1
 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
  BTX halted

The old 11.3 kernel still boots fine.

  /boot/kernel.old/kernel text=0x12941cb data=0xe8e74+0x2890ec
syms=[0x4+0xe9c90+0x4+0x178d4c]
  OK boot -s
  Booting...
  Copyright (c) 1992-2019 The FreeBSD Project.
  Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
  The Regents of the University of California. All rights reserved.
  FreeBSD is a registered trademark of The FreeBSD Foundation.
  FreeBSD 11.3-RELEASE-p11 #0: Wed Jul  8 05:39:37 UTC 2020
r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC i386
  FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based
on LLVM 8.0.0)
  VT(vga): resolution 640x480
  CPU: Pentium/P55C (233.03-MHz 586-class CPU)
Origin="GenuineIntel"  Id=0x543  Family=0x5  Model=0x4  Stepping=3
Features=0x8001bf
  real memory  = 133169152 (127 MB)
  avail memory = 98197504 (93 MB)
  ...

The kernel file is good and there's nothing in loader.conf that
should cause a problem.

# md5 -r /boot/kernel/kernel
40f1065ab4aff80489b456386e9721c0 /boot/kernel/kernel

# cat /boot/loader.conf
console="comconsole vidconsole"
hint.acpi.0.disabled=1  # removing this doesn't help
beastie_disable="YES"

Any suggestions?

(*) I occasionally have to pull data off 5-1/4 inch floppies and this
machine is equipped to do that.


Re: swap space issues

2020-07-12 Thread Scott Bennett via freebsd-stable
Don Wilde  wrote:

>
> On 7/11/20 11:28 PM, Scott Bennett via freebsd-stable wrote:
> >   I have read this entire thread to date with growing dismay, and I
> > thank Donald Wilde for reporting his ongoing troubles, although they
> > spoil my hopes that the kernel's memory management bugs that first became
> > apparent in 11.2-RELEASE (and -STABLE around the same time) were not
> > propagated into 12.x.  A recent update to stable/12 source tree made it
> > finally possible for me to build 12.1-STABLE under 11.4-PRERELEASE, and I
> > was just about to install the upgrade when this thread appeared.
> Spoiler alert. Since I gave up on Synth, I haven't had a single swap 
> issue. It does appear to be one particular port that drove it nuts 
> (apparently, one of the 'Google performance' bits, with a 
> mismatched-brackets problem). I have rebuilt the machine several times, 
> but that's more for my sense of tidiness than anything.
>
> I've got a little Crystal script that walks the installed packages and 
> ports and updates them with system() calls.
> The machine is very slow, but it's not swapping at all.

 That's good.  I use portmaster, but not often at present because a
"portmaster -a" run can only be done two or three times per boot before real
memory is locked down to the extent that the system is no longer functional
(i.e., even a scrub of ZFS pools comes to a halt in mid scrub due to lack of a
sufficient supply of free page frames).
 The build procedures of certain ports consistently get killed by the OOM
killer, along with much collateral damage.  I've noticed that lang/golang and
lang/rust are prime examples now, although both used to build without problems.
>
> It is quite usable now with 12-STABLE.

 I don't see any good reason to go through the hassle and lost time of an
upgrade across a major release boundary if I still won't have a production OS
afterward.  I'm already dealing with a graphics stack rendered unsafe to use by
the ongoing churn in X11 code.  (See PR #247441, kindly filed for me by Pau
Amma.)
> >
> >   On Fri, 26 Jun 2020 03:55:04 -0700 : Donald Wilde 
> > wrote:
> >
> >> On 6/26/20, Peter Jeremy  wrote:
> >>>
> [snip]
> >>> I strongly suggest you don't have more than one swap device on spinning
> >>> rust - the VM system will stripe I/O across the available devices and
> >>> that will give particularly poor results when it has to seek between the
> >>> partitions.
> >   True.  The only reason I can think of to use more than one swapping/
> > paging area on the same device for the same OS instance is for emergencies
> > or highly unusual, temporary situations in which more space is needed until
> > those situations conclude, and even in such situations, if the space can be
> > found on another device, it should be placed there.  Interleaving of swap
> > space across multiple devices is intended as a performance enhancement
> > akin to striping (a.k.a. RAID0), although the virtual memory isn't
> > necessarily always actually striped across those devices.  Adding a paging
> > area on the same device as an existing one is an abhorrent situation, as
> > Peter Jeremy noted, and it should be eliminated via swapoff(8) as soon as
> > the extraordinary situation has passed.  N.B. the GENERIC kernel sets a
> > limit of four swap devices, although it can be rebuilt with a different
> > limit.
> That's good data, Scott, thanks! The only reason I got into that 
> situation of trying to add another swap device was that it was crashing 
> with OO swap messages.

 I don't recall you posting those messages, but it sounds like exactly the
*temporary* situation in which adding an inappropriately placed paging area can
be used long enough to get you out of a bind without a reboot, even though
performance will probably suffer until you have removed it again.  Poor
performance is usually preferable to no performance if it is only temporary.
 One cautionary note in such situations, though, applies to remote paging
areas.  Sparse files allocated on the remote system should not be used as
paging areas.  For example, I discovered the hard way (i.e., the problem was
not documented) that SunOS would crash if a sparse file via NFS were added as
a paging area and the SunOS system tried to write a page out to an unallocated
region of the file, which was essentially all of the file at first.

> >> My intent is to make this machine function -- getting the bear
> >> dancing. How deftly she dances is less important than that she dances
> >> at all. My for-real boxen will have real HP and real cores and RAM.
> >>
> >>> Also, you can't actually use 64GB swap with 4GB RAM.  If you look back
> >>> through your boot messages, I expect you'll find messages like:
> >>> warning: total configured swap (524288 pages) exceeds maximum recommended
> >>> amount (498848 pages).
> >>> warning: increase kern.maxswzone or reduce amount of swap.
> >   Also true.  Unfortunately, no guidance whatsoever