Just to clarify, when I said "publicly visible" I meant via blog posts and talks. There are a few design <https://github.com/golang/proposal/blob/master/design/30333-smarter-scavenging.md> documents <https://github.com/golang/proposal/blob/master/design/35112-scaling-the-page-allocator.md#scavenging> and runtime-internal comments <https://cs.opensource.google/go/go/+/master:src/runtime/mgcscavenge.go;l=5?q=mgcscavenge.go&ss=go%2Fgo> that go into more depth.

On Monday, June 20, 2022 at 5:46:36 PM UTC-4 Michael Knyszek wrote:

> Thanks for the question. The scavenger isn't as publicly visible as other
> parts of the runtime. You've got it mostly right, but I'm going to repeat
> some things you've already said to make it clear what's different.
>
> The Go runtime maps new heap memory (specifically: a new virtual memory
> mapping for the heap) as read/write in increments called arenas. (Note: my
> use of "heap" here is a little loose; that pool of memory is also used for
> e.g. goroutine stacks.) The concept of arena is carried forward to how GC
> metadata is managed (chunk of metadata per arena) but is otherwise
> orthogonal to everything else I'm about to describe. To the scavenger, the
> concept of an arena doesn't really exist.
>
> The platform (OS + architecture) has some underlying physical page size
> (typically between 4 and 64 KiB, inclusive), but Go has an internal page
> size of 8 KiB. It divides all of memory up into these 8 KiB pages,
> including heap memory.
>
> The runtime assumes, in general, that new virtual memory is not backed by
> physical memory until first use (or an explicit system call on some
> platforms, like Windows). As free pages get allocated for the heap (for
> spans, as you say), they are assumed to be backed by physical memory. Once
> those pages are released, they are still assumed to be backed by physical
> memory.
>
> This is where the scavenger comes in: it tells the OS that these free
> regions of the address space, which it assumes are backed by physical
> pages, are no longer needed in the short term. So, the OS is free to take
> the physical memory back. "Telling the OS" is the madvise system call on
> Linux platforms. Note that the Go runtime could be wrong about whether the
> region is backed by physical memory; that's fine, madvise is just a hint
> anyway (a really useful one).
> (Also, it's really unlikely to be wrong,
> because memory needs to be zeroed before it's handed to the application.
> Still, it's theoretically possible.)
>
> The scavenger doesn't really have any impact on fragmentation, because the
> Go runtime is free to allocate a span out of a mix of scavenged and
> unscavenged pages. When it's actively scavenging, it briefly takes those
> pages out of the allocation pool, which can affect fragmentation, but the
> system is organized so that such a collision (and thus potentially some
> fragmentation) is less likely.
>
> The result is basically just fewer physical pages consumed by Go
> applications (what "top" reports as "RSS") at the cost of about 1% of total
> CPU time. The CPU cost, however, is usually much less; 1% is just the
> target while it's active, but in the steady-state there's typically not too
> much work to do.
>
> The Go runtime also never unmaps heap memory, because virtual memory
> that's guaranteed to not be backed by physical memory is very cheap (likely
> just a single interval in some OS bookkeeping). Unmapping virtual address
> space is also fairly expensive in comparison to madvise, so it's worthwhile
> to avoid.
>
> I don't fully understand what you mean by "layered cake" in this context.
> The memory allocator in general is certainly a "layered cake," but the
> scavenger just operates directly on the pool of free pages (which again,
> don't have much to do with arenas other than that happens to be the
> increment that new pages are added to the pool).
>
> There are also two additional complications to all of this:
> (1) Because the Go runtime's page size doesn't usually match the system's
> physical page size, the scavenger needs to be careful to only return
> contiguous and aligned runs of pages that add up to the physical page size.
> This makes it less effective on platforms with physical pages larger than 8
> KiB because fragmentation can prevent an entire physical page from being
> free.
> This is fine, though; the scavenger is most useful when, for example,
> the heap size shrinks significantly. Then there's almost always a large
> swathe of available free pages. Note also that platforms with smaller
> physical page sizes are fine, because every scavenge operation releases
> some multiple of physical pages.
> (2) The Go runtime tries to take into account transparent huge pages as
> well. That's its own can of worms that I won't go into for now.
>
>
> On Monday, June 20, 2022 at 11:48:02 AM UTC-4 vitaly...@gmail.com wrote:
>
>> The Go allocator requests memory from the OS in large arenas (on Linux
>> x86_64 the size of an arena is 64 MiB), then the allocator splits each
>> arena into 8 KiB pages, then merges pages into spans of different sizes
>> (from 8 KiB to 80 KiB according to
>> https://go.dev/src/runtime/sizeclasses.go). This process is well
>> described in various blog posts and presentations.
>>
>> But there is much less information about the scavenger. Is it true that,
>> in contrast to the allocation process, the scavenger returns to the OS
>> not arenas, but the pages underlying idle spans? This is performed with
>> madvise(MADV_DONTNEED).
>>
>> If so, am I correct that after a while the virtual address space of a Go
>> application resembles a "layered cake" of interleaving used and reclaimed
>> memory regions (kind of a classic memory fragmentation problem)? It looks
>> like if the application requires more virtual memory after some time, the
>> OS won't be able to reuse these page-size regions to allocate contiguous
>> space sufficient for arena allocation.
>>
>> Are there any consequences of this design for the runtime performance,
>> especially for the RSS consumption?
>>
>> Finally, how does the runtime decide what to use - munmap or madvise -
>> for the purposes of memory reclamation?
>>
>> Thank you
>>