Thanks for the question. The scavenger isn't as publicly visible as other 
parts of the runtime. You've got it mostly right, but I'm going to repeat 
some things you've already said to make it clear what's different.

The Go runtime maps new heap memory (specifically: a new virtual memory 
mapping for the heap) as read/write in increments called arenas. (Note: my 
use of "heap" here is a little loose; that pool of memory is also used for 
e.g. goroutine stacks.) The concept of an arena is carried forward into how 
GC metadata is managed (one chunk of metadata per arena), but it is 
otherwise orthogonal to everything else I'm about to describe. To the 
scavenger, arenas effectively don't exist.
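
To make that concrete, here's a minimal sketch (my own illustration, not 
runtime code) that watches HeapSys, the virtual memory the runtime has 
obtained for the heap, grow as allocations force new arenas to be mapped. 
The 64 MiB arena size is the linux/amd64 value; it varies by platform.

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var before, after runtime.MemStats
        runtime.ReadMemStats(&before)

        // Allocate enough to force the heap well past one arena.
        bufs := make([][]byte, 0, 128)
        for i := 0; i < 128; i++ {
            bufs = append(bufs, make([]byte, 1<<20)) // 1 MiB each
        }

        runtime.ReadMemStats(&after)
        // On linux/amd64, expect growth in rough multiples of 64 MiB.
        fmt.Printf("HeapSys grew by %d MiB\n", (after.HeapSys-before.HeapSys)>>20)
        runtime.KeepAlive(bufs)
    }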

The platform (OS + architecture) has some underlying physical page size 
(typically between 4 and 64 KiB, inclusive), but Go has an internal page 
size of 8 KiB. It divides all of memory up into these 8 KiB pages, 
including heap memory.
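
For example (os.Getpagesize reports the OS page size; the 8 KiB figure is a 
runtime-internal constant that isn't exported, so it's hard-coded here):

    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        fmt.Println("OS page size:         ", os.Getpagesize()) // e.g. 4096 on linux/amd64
        fmt.Println("Go internal page size:", 8192)             // runtime-internal, not exported
    }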

The runtime assumes, in general, that new virtual memory is not backed by 
physical memory until first use (or until an explicit system call on some 
platforms, like Windows). As free pages get allocated for the heap (for 
spans, as you say), they are assumed to be backed by physical memory. Once 
those pages are freed again, they are still assumed to be backed by 
physical memory.
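
You can see that "not backed until first use" behavior outside the runtime, 
too. A sketch using golang.org/x/sys/unix on Linux: the anonymous mapping 
below consumes essentially no physical memory until the loop writes to it.

    package main

    import (
        "fmt"

        "golang.org/x/sys/unix"
    )

    func main() {
        const size = 1 << 26 // reserve 64 MiB of address space
        mem, err := unix.Mmap(-1, 0, size,
            unix.PROT_READ|unix.PROT_WRITE,
            unix.MAP_ANON|unix.MAP_PRIVATE)
        if err != nil {
            panic(err)
        }
        defer unix.Munmap(mem)

        // Until these writes, the mapping contributes almost nothing to RSS.
        for i := 0; i < size; i += 4096 {
            mem[i] = 1 // first touch faults in a physical page
        }
        fmt.Println("touched", size>>20, "MiB; RSS now reflects it")
    }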

This is where the scavenger comes in: it tells the OS that these free 
regions of the address space, which it assumes are backed by physical 
pages, are no longer needed in the short term, so the OS is free to take 
the physical memory back. "Telling the OS" means making the madvise system 
call on Linux. Note that the Go runtime could be wrong about whether a 
region is backed by physical memory; that's fine, because madvise is just a 
hint anyway (a really useful one). (It's also really unlikely to be wrong, 
because memory needs to be zeroed before it's handed to the application. 
Still, it's theoretically possible.)
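
Conceptually, the scavenger's core operation looks something like the 
sketch below. This is my own simplification, not the runtime's code; the 
runtime has used both MADV_FREE and MADV_DONTNEED across releases, and you 
can force the eager form with GODEBUG=madvdontneed=1.

    package main

    import (
        "fmt"

        "golang.org/x/sys/unix"
    )

    // scavenge hints to the OS that the pages backing b are no longer
    // needed soon; the OS is then free to take the physical memory back.
    func scavenge(b []byte) error {
        // MADV_FREE (Linux 4.5+) is the lazy form; fall back to the
        // eager MADV_DONTNEED if it isn't available.
        if err := unix.Madvise(b, unix.MADV_FREE); err != nil {
            return unix.Madvise(b, unix.MADV_DONTNEED)
        }
        return nil
    }

    func main() {
        mem, err := unix.Mmap(-1, 0, 1<<20,
            unix.PROT_READ|unix.PROT_WRITE,
            unix.MAP_ANON|unix.MAP_PRIVATE)
        if err != nil {
            panic(err)
        }
        mem[0] = 1 // fault in at least one physical page
        fmt.Println("scavenge:", scavenge(mem))
    }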

The scavenger doesn't really have any impact on fragmentation, because the 
Go runtime is free to allocate a span out of a mix of scavenged and 
unscavenged pages. While it's actively scavenging, it briefly takes the 
pages it's working on out of the allocation pool, which can affect 
fragmentation, but the system is organized so that such a collision (and 
thus potentially some fragmentation) is less likely.

The result is basically just fewer physical pages consumed by Go 
applications (what "top" reports as RSS), at a cost of at most about 1% of 
total CPU time. The actual CPU cost is usually much less: 1% is just the 
target while the scavenger is active, and in the steady state there's 
typically not much work for it to do.
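
If you want to watch this from inside a program, MemStats exposes it: 
HeapReleased counts heap bytes returned to the OS, and debug.FreeOSMemory 
forces a GC plus an immediate scavenge. A quick sketch:

    package main

    import (
        "fmt"
        "runtime"
        "runtime/debug"
    )

    func main() {
        // Allocate and then drop a large buffer so there's something
        // for the scavenger to return.
        buf := make([]byte, 1<<28) // 256 MiB
        buf[0] = 1
        buf = nil

        debug.FreeOSMemory() // GC, then release what it can immediately

        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        fmt.Printf("HeapIdle:     %d MiB\n", m.HeapIdle>>20)
        fmt.Printf("HeapReleased: %d MiB\n", m.HeapReleased>>20)
    }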

The Go runtime also never unmaps heap memory, because virtual memory that's 
guaranteed not to be backed by physical memory is very cheap (likely just a 
single interval in some OS bookkeeping). Unmapping virtual address space is 
also fairly expensive compared to madvise, so it's worthwhile to avoid.

I don't fully understand what you mean by "layered cake" in this context. 
The memory allocator in general is certainly a "layered cake," but the 
scavenger just operates directly on the pool of free pages (which, again, 
doesn't have much to do with arenas, beyond the fact that an arena happens 
to be the increment in which new pages are added to the pool).

There are also two additional complications to all of this:
(1) Because the Go runtime's page size doesn't usually match the system's 
physical page size, the scavenger needs to be careful to only return 
contiguous, aligned runs of Go pages that add up to whole physical pages 
(see the sketch after this list). This makes it less effective on platforms 
with a physical page size larger than 8 KiB, because fragmentation can 
prevent an entire physical page from being free. That's fine, though; the 
scavenger is most useful when, for example, the heap size shrinks 
significantly, and then there's almost always a large swathe of free pages 
available. Platforms with smaller physical page sizes are fine, because 
every scavenge operation releases some whole multiple of physical pages.
(2) The Go runtime also tries to take transparent huge pages into account. 
That's its own can of worms that I won't go into for now.
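
Here's the sketch of the alignment constraint in (1). The function name and 
the 64 KiB physical page are illustrative, not the runtime's:

    package main

    import "fmt"

    const (
        goPageSize   = 8 << 10  // the runtime's 8 KiB page
        physPageSize = 64 << 10 // e.g. a 64 KiB physical page
    )

    // alignedRange trims a free run [start, end) to physical-page
    // boundaries; only that portion can be returned to the OS.
    func alignedRange(start, end uintptr) (uintptr, uintptr) {
        alignedStart := (start + physPageSize - 1) &^ (physPageSize - 1) // round up
        alignedEnd := end &^ (physPageSize - 1)                          // round down
        if alignedStart >= alignedEnd {
            return 0, 0 // no whole physical page is free
        }
        return alignedStart, alignedEnd
    }

    func main() {
        // A free run of 24 Go pages (192 KiB) starting at 40 KiB: only
        // the aligned middle portion, [64 KiB, 192 KiB), is scavengeable.
        start := uintptr(40 << 10)
        end := start + 24*goPageSize
        s, e := alignedRange(start, end)
        fmt.Printf("scavengeable: [%d KiB, %d KiB)\n", s>>10, e>>10)
    }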

On Monday, June 20, 2022 at 11:48:02 AM UTC-4 vitaly...@gmail.com wrote:

> The Go allocator requests memory from the OS in large arenas (on Linux 
> x86_64 the arena size is 64 MiB), then the allocator splits each arena 
> into 8 KiB pages, then merges pages into spans of different sizes (from 
> 8 KiB to 80 KiB, according to 
> https://go.dev/src/runtime/sizeclasses.go). This process is well 
> described in various blog posts and presentations.
>
> But there is much less information about the scavenger. Is it true that, 
> in contrast to the allocation process, the scavenger reclaims to the OS 
> not arenas but the pages underlying idle spans? This is performed with 
> madvise(MADV_DONTNEED).
>
> If so, am I correct that after a while the virtual address space of a Go 
> application resembles a "layered cake" of interleaved used and reclaimed 
> memory regions (a kind of classic memory fragmentation problem)? It 
> looks like, if the application requires more virtual memory after some 
> time, the OS won't be able to reuse these page-size regions to allocate 
> contiguous space sufficient for an arena allocation.
>
> Are there any consequences of this design for runtime performance, 
> especially for RSS consumption?
>
> Finally, how does the runtime decide what to use, munmap or madvise, for 
> the purposes of memory reclamation?
>
> Thank you
