On Thu, 14 Feb 2019, Mark Johnston wrote:

> On Thu, Feb 14, 2019 at 06:56:42PM +1100, Bruce Evans wrote:
>* ...
>> The only relevant commit between the good and bad versions seems to be
>> r343453. This fixes uma_prealloc() to actually work. But it is a feature
>> for it to not work when its caller asks for too much.
>
> I guess you meant r343353. In any case, the pbuf keg is _NOFREE, so
> even without preallocation the large pbuf zone limits may become
> problematic if there are bursts of allocation requests.

Oops.

>* ...
>> I don't understand how pbuf_preallocate() allocates for the other
>> pbuf pools. When I debugged this for clpbufs, the preallocation was
>> not used. pbuf types other than clpbufs seem to be unused in my
>> configurations. I thought that pbufs were used during initialization,
>> since they end up with a nonzero FREE count, but their only use seems
>> to be to preallocate them.
>
> All of the pbuf zones share a common slab allocator. The zones have
> individual limits but can tap in to the shared preallocation.

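As a concrete illustration of that arrangement, here is a hypothetical
sketch of primary and secondary UMA zones over a shared _NOFREE keg.
The item size and the limits are taken from the vmstat output below;
the function and variable names are made up for the sketch and this is
not the actual vm_pager.c code:

/*
 * Hypothetical sketch of the shared-keg arrangement described above;
 * not the actual vm_pager.c code.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <vm/uma.h>

#define	PBUF_ITEM_SIZE	336		/* matches the SIZE column below */

static uma_zone_t pbuf_zone;		/* primary zone; owns the keg */
static int pbuf_zone_size;		/* running sum of the per-zone limits */

static uma_zone_t
pbuf_zsecond_create(const char *name, int max)
{
	uma_zone_t zone;

	/* A secondary zone shares the primary zone's keg (and its slabs). */
	zone = uma_zsecond_create(name, NULL, NULL, NULL, NULL, pbuf_zone);
	uma_zone_set_max(zone, max);
	pbuf_zone_size += max;
	return (zone);
}

static void
pbuf_zones_init(void)
{

	/* UMA_ZONE_NOFREE: slabs are never returned to the VM. */
	pbuf_zone = uma_zcreate("pbuf", PBUF_ITEM_SIZE, NULL, NULL, NULL,
	    NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE);
	uma_zone_set_max(pbuf_zone, 16);
	pbuf_zone_size = 16;

	/* Each consumer gets its own limit on top of the shared keg. */
	(void)pbuf_zsecond_create("clpbuf", 128);
	(void)pbuf_zsecond_create("vnpbuf", 2048);
	/* ... swrbuf, swwbuf, nfspbuf and mdpbuf likewise ... */
}
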
It seems to be working as intended now (except the allocation count is
3 higher than expected):

XX ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
XX
XX swrbuf:                 336,    128,       0,       0,       0,   0,   0
XX swwbuf:                 336,     64,       0,       0,       0,   0,   0
XX nfspbuf:                336,    128,       0,       0,       0,   0,   0
XX mdpbuf:                 336,     25,       0,       0,       0,   0,   0
XX clpbuf:                 336,    128,       0,      35,    2918,   0,   0
XX vnpbuf:                 336,   2048,       0,       0,       0,   0,   0
XX pbuf:                   336,     16,       0,    2505,       0,   0,   0

pbuf should have 2537 preallocated and FREE initially, but seems to actually
have 2540. pbufs were only used for clustering, and 35 of them were moved
from pbuf to clpbuf.
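Continuing the earlier sketch, the preallocation that produces those FREE
counts can be expressed as a single uma_prealloc() of the summed limits on
the primary zone; the SYSINIT placement here is only a guess, and the 3
extra items are not explained by it:

/* Continuing the sketch above; <sys/kernel.h> is needed for SYSINIT(). */
#include <sys/kernel.h>

static void
pbuf_preallocate(void *arg __unused)
{

	/*
	 * One uma_prealloc() on the primary zone covers all of the
	 * secondary zones, since they share its keg.  With the limits
	 * from the vmstat output this is
	 * 128 + 64 + 128 + 25 + 128 + 2048 + 16 = 2537 items; the 3
	 * extra FREE items observed are not explained here.
	 */
	uma_prealloc(pbuf_zone, pbuf_zone_size);
}
SYSINIT(pbufprealloc, SI_SUB_KTHREAD_BUF, SI_ORDER_ANY, pbuf_preallocate,
    NULL);
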
In the buggy version, the preallocation stopped after 4 buffers. Clustering
presumably moved these 4 to clpbuf, then used non-preallocated buffers
until it reached its limit, and then recycled its own buffers.
What should happen, to recover the old overcommit behaviour with better
debugging, is to preallocate 256 buffers (a few more on large systems) in
pbuf and move these to the other pools on demand, but never allocate
directly into the other pools (keep buffers in the other pools only as an
optimization and release them to the main pool under pressure). Also allow
dynamic tuning of the pool sizes. The vnode cache does essentially this by
using 1 overcommitted pool with unlimited size in uma and managing the size
externally. There the separate pools would correspond to separate file
systems; those are too hard to manage, so the vnode cache throws everything
into the main pool and depends on locality to keep the overcommit from
being too large.
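A hypothetical sketch of that scheme (none of these names exist in the
tree): one main free list holding the ~256 preallocated buffers, per-type
sub-pools that only ever borrow from it and cache a few buffers as an
optimization, and a drain routine to give them back under memory pressure,
with the per-pool limit left as a tunable:

/*
 * Hypothetical sketch only: xpbuf, xpbuf_pool and friends do not exist
 * in the tree.  A real buffer would carry much more state than the
 * list linkage shown here.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/queue.h>

struct xpbuf {
	TAILQ_ENTRY(xpbuf)	xp_list;
};

/* Main pool, populated at boot with ~256 preallocated buffers. */
static TAILQ_HEAD(, xpbuf) xpbuf_main = TAILQ_HEAD_INITIALIZER(xpbuf_main);
static struct mtx xpbuf_mtx;
MTX_SYSINIT(xpbuf_mtx, &xpbuf_mtx, "xpbuf main pool", MTX_DEF);

struct xpbuf_pool {
	TAILQ_HEAD(, xpbuf)	xpp_free;	/* cached as an optimization */
	int			xpp_max;	/* soft, dynamically tunable */
	int			xpp_count;	/* buffers borrowed from main */
};

static void
xpbuf_pool_init(struct xpbuf_pool *pp, int max)
{

	TAILQ_INIT(&pp->xpp_free);
	pp->xpp_max = max;
	pp->xpp_count = 0;
}

/*
 * Take a buffer for a sub-pool.  Buffers only ever come from the pool's
 * own cache or from the main pool; nothing is allocated directly into
 * the sub-pools.
 */
static struct xpbuf *
xpbuf_get(struct xpbuf_pool *pp)
{
	struct xpbuf *xp;

	mtx_lock(&xpbuf_mtx);
	if ((xp = TAILQ_FIRST(&pp->xpp_free)) != NULL) {
		TAILQ_REMOVE(&pp->xpp_free, xp, xp_list);
	} else if (pp->xpp_count < pp->xpp_max &&
	    (xp = TAILQ_FIRST(&xpbuf_main)) != NULL) {
		TAILQ_REMOVE(&xpbuf_main, xp, xp_list);
		pp->xpp_count++;
	}
	mtx_unlock(&xpbuf_mtx);
	return (xp);			/* NULL: caller must wait */
}

/* Return a buffer to its sub-pool's cache. */
static void
xpbuf_put(struct xpbuf_pool *pp, struct xpbuf *xp)
{

	mtx_lock(&xpbuf_mtx);
	TAILQ_INSERT_HEAD(&pp->xpp_free, xp, xp_list);
	mtx_unlock(&xpbuf_mtx);
}

/* Under memory pressure, release a sub-pool's cached buffers to main. */
static void
xpbuf_drain(struct xpbuf_pool *pp)
{
	struct xpbuf *xp;

	mtx_lock(&xpbuf_mtx);
	while ((xp = TAILQ_FIRST(&pp->xpp_free)) != NULL) {
		TAILQ_REMOVE(&pp->xpp_free, xp, xp_list);
		TAILQ_INSERT_TAIL(&xpbuf_main, xp, xp_list);
		pp->xpp_count--;
	}
	mtx_unlock(&xpbuf_mtx);
}
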
Bruce