On Thu, Feb 14, 2019 at 06:56:42PM +1100, Bruce Evans wrote:
> On Wed, 13 Feb 2019, Justin Hibbits wrote:
>
> > On Tue, 15 Jan 2019 01:02:17 +0000 (UTC)
> > Gleb Smirnoff <gleb...@freebsd.org> wrote:
> >
> >> Author: glebius
> >> Date: Tue Jan 15 01:02:16 2019
> >> New Revision: 343030
> >> URL: https://svnweb.freebsd.org/changeset/base/343030
> >>
> >> Log:
> >>   Allocate pager bufs from UMA instead of 80-ish mutex protected
> >>   linked list.
> > ...
> >
> > This seems to break 32-bit platforms, or at least 32-bit book-e
> > powerpc, which has a limited KVA space (~500MB). It preallocates I've
> > seen over 2500 pbufs, at 128kB each, eating up over 300MB KVA,
> > leaving very little left for the rest of runtime.
>
> Hrmph. I complained other things in this commit this when it was
> committed, but not this largest bug since preallocation was broken then
> so I thought that it wasn't done, so that problems are smaller unless the
> excessive limits are actually reached.
>
> Now i386 does it:
>
> XX ITEM        SIZE  LIMIT   USED   FREE    REQ   FAIL  SLEEP
> XX
> XX swrbuf:      336,   128,     0,     0,     0,     0,     0
> XX swwbuf:      336,    64,     0,     0,     0,     0,     0
> XX nfspbuf:     336,   128,     0,     0,     0,     0,     0
> XX mdpbuf:      336,    25,     0,     0,     0,     0,     0
> XX clpbuf:      336,   128,     0,     5,     4,     0,     0
> XX vnpbuf:      336,  2048,     0,     0,     0,     0,     0
> XX pbuf:        336,    16,     0,  2535,     0,     0,     0
>
> but i386 now has 4GB of KVA, with almost 3GB to waste, so the bug is not
> noticed there.
>
> The preallocation wasn't there in my last mail to the author about nearby
> bugs, on 24 Jan 2019:
>
> YY vnpbuf:      568,  2048,     0,     0,     0,     0,     0
> YY clpbuf:      568,   128,     0,   128,  8750,     0,     1
> YY pbuf:        568,    16,     0,     4,     0,     0,     0
>
> This output is on amd64 where the SIZE is larger and everything else was
> the same as on i386. Now amd64 shows the large preallocation too.
>
> There seems to be another bug for the especially small LIMIT of 16 to
> turn into a preallocation of 2535 and not cause immediate reduction to
> the limit.
>
> I happen to have kernels from 24 and 25 Jan handy. The first one is
> amd64 r343346M built on Jan 23, and it doesn't do the large
> preallocation. The second one is i386 r343388:343418M built on Jan
> 25, and it does the large preallocation. Both call uma_prealloc() to
> ask for nswbuf_max = 0x9e9 buffers, but the old version only allocates
> 4 buffers while later version allocate 0x9e9 buffers.
>
> The only relevant commit between the good and bad versions seems to be
> r343453. This fixes uma_prealloc() to actually work. But it is a feature
> for it to not work when its caller asks for too much.

I guess you meant r343353. In any case, the pbuf keg is _NOFREE, so
even without preallocation the large pbuf zone limits may become
problematic if there are bursts of allocation requests.

> 0x9e9 is the sum of the LIMITs of all pbuf pools. The main bug in
> r343030 is that it expands nswbuf, which is supposed to give the
> combined limit, from its normal value of 256 to 0x9e9. (r343030
> actually used nswbuf before it was properly initialized, so used its
> maximum value of 256 even on small systems with nswbuf = 16. Only
> this has been fixed.)
>
> On i386, nbuf is excessively limited so as to give a maxbufspace of
> about 100MB so as to fit in 1GB of kva even with infinite RAM and
> -current's actual 4GB of kva. nbuf is correctly limited to give a
> much smaller maxbufspace when RAM is small (kva scaling for this is
> not done so well). nswbuf is restricted if nbuf is restricted, but
> not enough (except in my version). It is normally 256, so the pbuf
> allocation used to be 32MB, and this is already a bit large compared
> with 100MB for maxbufspace. Expanding pbufs by a factor of 0x9e9/0x100
> gives the silly combination of 100MB for maxbufspace and 317MB for
> pbufs.
>
> If kva is only 512MB instead of 1GB, then maxbufspace should be only
> 50MB and nswbuf should be smaller too. Similarly for PAE on i386 back
> when it was configured with 1GB kva by default. Only about 512MB are
> left after allocating space for page table metadata. I have fixes
> that scale most of this better. Large subsystems starting with kmem
> get a hard-coded fraction of the usable kva. E.g., kmem gets about
> 60% of usable kva instead of about 40% of nominal kva. Most other
> large subsystems including the buffer cache get about 1/8 of the
> remaining 40% of usable kva. Scaling for other subsystems is mostly
> worse than for kmem. pbufs are part of the buffer cache allocation.
> The expansion factor of 0x9e9/0x100 breaks this.
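
As a sanity check on the figures in the quoted text, here is a small
Python sketch of the arithmetic. The per-zone limits are copied from
the i386 output quoted earlier and the 128kB per pbuf is Justin's
figure (treated here as 128 KiB, which is my assumption). The variable
names are made up for illustration and are not kernel identifiers.

```python
# Per-zone LIMIT values, copied from the quoted i386 output.
zone_limits = {
    "swrbuf": 128,
    "swwbuf": 64,
    "nfspbuf": 128,
    "mdpbuf": 25,
    "clpbuf": 128,
    "vnpbuf": 2048,
    "pbuf": 16,
}

PBUF_KVA_KIB = 128  # KVA per pbuf, from Justin's report (assumed to mean KiB)
OLD_NSWBUF = 256    # the "normal" combined limit mentioned in the quoted text

# Sum of all per-zone limits; this is what nswbuf was expanded to.
total = sum(zone_limits.values())

# KVA reserved for pbufs before and after the expansion, in MiB.
old_kva_mib = OLD_NSWBUF * PBUF_KVA_KIB / 1024
new_kva_mib = total * PBUF_KVA_KIB / 1024

print(hex(total), total)
print(old_kva_mib, new_kva_mib)
```

Under those assumptions the sum comes out as 0x9e9, and the two KVA
figures agree with the 32MB and 317MB quoted above.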
>
> I don't understand how pbuf_preallocate() allocates for the other
> pbuf pools. When I debugged this for clpbufs, the preallocation was
> not used. pbuf types other than clpbufs seem to be unused in my
> configurations. I thought that pbufs were used during initialization,
> since they end up with a nonzero FREE count, but their only use seems
> to be to preallocate them.

All of the pbuf zones share a common slab allocator. The zones have
individual limits but can tap into the shared preallocation.
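
To illustrate the point about the zones sharing one backend while
keeping individual limits, here is a toy Python model. It is only a
sketch of the idea and not how UMA is implemented: the Keg and Zone
classes and all of their methods are hypothetical, and the only
numbers taken from this thread are the limits 16 and 128 and the
preallocation size 0x9e9.

```python
class Keg:
    """Toy shared backend: items are kept forever, loosely like a NOFREE keg."""

    def __init__(self):
        self.total = 0   # items ever created; never shrinks
        self.cached = 0  # items currently sitting unused in the keg

    def prealloc(self, count):
        self.total += count
        self.cached += count

    def take(self):
        if self.cached > 0:
            self.cached -= 1  # reuse a preallocated or returned item
        else:
            self.total += 1   # nothing cached, so the keg grows

    def give_back(self):
        self.cached += 1      # returned to the keg, not to the system


class Zone:
    """Toy zone with its own limit, drawing items from a shared keg."""

    def __init__(self, name, limit, keg):
        self.name, self.limit, self.keg = name, limit, keg
        self.used = 0

    def alloc(self):
        if self.used >= self.limit:
            return False      # per-zone limit reached
        self.keg.take()
        self.used += 1
        return True

    def release(self):
        assert self.used > 0
        self.used -= 1
        self.keg.give_back()


keg = Keg()
pbuf = Zone("pbuf", 16, keg)
clpbuf = Zone("clpbuf", 128, keg)

keg.prealloc(0x9E9)  # one preallocation, sized as in the quoted text
granted = sum(clpbuf.alloc() for _ in range(200))  # capped by clpbuf's limit
print(granted, keg.cached, keg.total)
```

In this model the clpbuf requests are satisfied from items that were
preallocated once for the shared keg, and the zone's own limit is what
stops it at 128. Because nothing is ever handed back to the system,
the keg's total only reflects the largest demand it has seen.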