Just to clarify, when I said "publicly visible" I meant via blog posts and 
talks. There are a few design documents 
<https://github.com/golang/proposal/blob/master/design/30333-smarter-scavenging.md>
<https://github.com/golang/proposal/blob/master/design/35112-scaling-the-page-allocator.md#scavenging>
and runtime-internal comments 
<https://cs.opensource.google/go/go/+/master:src/runtime/mgcscavenge.go;l=5?q=mgcscavenge.go&ss=go%2Fgo>
that go into more depth.

On Monday, June 20, 2022 at 5:46:36 PM UTC-4 Michael Knyszek wrote:

> Thanks for the question. The scavenger isn't as publicly visible as other 
> parts of the runtime. You've got it mostly right, but I'm going to repeat 
> some things you've already said to make it clear what's different.
>
> The Go runtime maps new heap memory (specifically: a new virtual memory 
> mapping for the heap) as read/write in increments called arenas. (Note: my 
> use of "heap" here is a little loose; that pool of memory is also used for 
> e.g. goroutine stacks.) The concept of arena is carried forward to how GC 
> metadata is managed (chunk of metadata per arena) but is otherwise 
> orthogonal to everything else I'm about to describe. To the scavenger, the 
> concept of an arena doesn't really exist.
>
> The platform (OS + architecture) has some underlying physical page size 
> (typically between 4 and 64 KiB, inclusive), but Go has an internal page 
> size of 8 KiB. It divides all of memory up into these 8 KiB pages, 
> including heap memory.
>
> The runtime assumes, in general, that new virtual memory is not backed by 
> physical memory until first use (or an explicit system call on some 
> platforms, like Windows). As free pages get allocated for the heap (for 
> spans, as you say), they are assumed to be backed by physical memory. Once 
> those pages are released, they are still assumed to be backed by physical 
> memory.
>
> This is where the scavenger comes in: it tells the OS that these free 
> regions of the address space, which it assumes are backed by physical 
> pages, are no longer needed in the short term. So, the OS is free to take 
> the physical memory back. "Telling the OS" is the madvise system call on 
> Linux platforms. Note that the Go runtime could be wrong about whether the 
> region is backed by physical memory; that's fine, madvise is just a hint 
> anyway (a really useful one). (Also, it's really unlikely to be wrong, 
> because memory needs to be zeroed before it's handed to the application. 
> Still, it's theoretically possible.)
>
> The scavenger doesn't really have any impact on fragmentation, because the 
> Go runtime is free to allocate a span out of a mix of scavenged and 
> unscavenged pages. When it's actively scavenging, it briefly takes those 
> pages out of the allocation pool, which can affect fragmentation, but the 
> system is organized so that such a collision (and thus potentially some 
> fragmentation) is less likely.
>
> The result is basically just fewer physical pages consumed by Go 
> applications (what "top" reports as "RSS") at the cost of at most about 1% 
> of total CPU time. In practice the CPU cost is usually much less; 1% is 
> just the target while the scavenger is active, and in the steady state 
> there's typically not much work to do.
>
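One rough way to observe released memory from inside a program is runtime.MemStats. The sketch below is an illustration under assumptions, not a measurement of the background scavenger: it calls debug.FreeOSMemory, which forces a GC and an immediate attempt to return memory to the OS, and the helper name releasedAfterFree is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink keeps the allocation reachable until it is deliberately dropped.
var sink []byte

// releasedAfterFree is a hypothetical helper. It allocates size bytes,
// touches every 4 KiB so the memory is actually used, drops the
// reference, forces memory to be returned to the OS, and reports the
// HeapIdle and HeapReleased counters from runtime.MemStats.
func releasedAfterFree(size int) (idle, released uint64) {
	sink = make([]byte, size)
	for i := 0; i < len(sink); i += 4096 {
		sink[i] = 1
	}
	sink = nil

	// Forces a garbage collection, then tries to return as much memory
	// to the OS as possible. This is synchronous, unlike the background
	// scavenger described above.
	debug.FreeOSMemory()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapIdle, m.HeapReleased
}

func main() {
	idle, released := releasedAfterFree(64 << 20)
	fmt.Printf("HeapIdle=%d KiB HeapReleased=%d KiB\n", idle>>10, released>>10)
}
```

HeapReleased counts bytes of physical memory returned to the OS, so it is the counter that should grow when free pages are scavenged; RSS as reported by "top" is measured by the OS and may not match it exactly.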
> The Go runtime also never unmaps heap memory, because virtual memory 
> that's guaranteed to not be backed by physical memory is very cheap (likely 
> just a single interval in some OS bookkeeping). Unmapping virtual address 
> space is also fairly expensive in comparison to madvise, so it's worthwhile 
> to avoid.
>
> I don't fully understand what you mean by "layered cake" in this context. 
> The memory allocator in general is certainly a "layered cake," but the 
> scavenger just operates directly on the pool of free pages (which, again, 
> doesn't have much to do with arenas, other than that arenas happen to be 
> the increment in which new pages are added to the pool).
>
> There are also two additional complications to all of this:
> (1) Because the Go runtime's page size doesn't usually match the system's 
> physical page size, the scavenger needs to be careful to only return 
> contiguous and aligned runs of pages that add up to the physical page size. 
> This makes it less effective on platforms with physical pages larger than 8 
> KiB because fragmentation can prevent an entire physical page from being 
> free. This is fine, though; the scavenger is most useful when, for example, 
> the heap size shrinks significantly. Then there's almost always a large 
> swathe of available free pages. Note also that platforms with smaller 
> physical page sizes are fine, because every scavenge operation releases 
> some multiple of physical pages.
> (2) The Go runtime tries to take into account transparent huge pages as 
> well. That's its own can of worms that I won't go into for now.
>
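The alignment constraint in point (1) can be sketched as plain address arithmetic. This is an illustration only, not the runtime's algorithm: releasable is a hypothetical helper, the addresses are made up, and the physical page size is assumed to be a power of two.

```go
package main

import "fmt"

// runtimePageSize is Go's internal page size as stated above.
const runtimePageSize = 8 << 10

// releasable is a hypothetical helper. Given a free region [start, end)
// made of runtime pages, it returns the largest sub-range that begins
// and ends on a physical page boundary. physPageSize must be a power of
// two. ok is false when no whole physical page fits in the region.
func releasable(start, end, physPageSize uintptr) (lo, hi uintptr, ok bool) {
	lo = (start + physPageSize - 1) &^ (physPageSize - 1) // round start up
	hi = end &^ (physPageSize - 1)                        // round end down
	if lo >= hi {
		return 0, 0, false
	}
	return lo, hi, true
}

func main() {
	const start = 0x12000 // made-up address, 8 KiB aligned

	// 10 free runtime pages (80 KiB) on a platform with 64 KiB pages.
	end := uintptr(start + 10*runtimePageSize)
	_, _, ok := releasable(start, end, 64<<10)
	fmt.Println("64 KiB pages, 80 KiB free, releasable:", ok)

	// The same free run on a platform with 4 KiB pages.
	lo, hi, ok := releasable(start, end, 4<<10)
	fmt.Printf("4 KiB pages: %#x-%#x releasable: %v\n", lo, hi, ok)
}
```

With 64 KiB physical pages the 80 KiB free run straddles a boundary without covering a whole physical page, so nothing can be returned. With 4 KiB physical pages every 8 KiB runtime page is already a whole multiple, so the entire run qualifies.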
>
> On Monday, June 20, 2022 at 11:48:02 AM UTC-4 vitaly...@gmail.com wrote:
>
>> The Go allocator requests memory from the OS in large arenas (on Linux 
>> x86_64 the arena size is 64 MiB), then splits each arena into 8 KiB pages, 
>> then merges pages into spans of different sizes (from 8 KiB to 80 KiB 
>> according to https://go.dev/src/runtime/sizeclasses.go). This process is 
>> well described in various blog posts and presentations. 
>>
>
>> But there is much less information about the scavenger. Is it true that, in 
>> contrast to the allocation process, the scavenger returns to the OS not 
>> arenas but the pages underlying idle spans? This is performed with 
>> madvise(MADV_DONTNEED). 
>>
>
>>
>> If so, am I correct that after a while the virtual address space of a Go 
>> application resembles a "layered cake" of interleaved used and reclaimed 
>> memory regions (a kind of classic memory fragmentation problem)? It looks 
>> like, if the application requires more virtual memory after some time, the 
>> OS won't be able to reuse these page-size regions to allocate contiguous 
>> space sufficient for an arena allocation.
>>
>> Are there any consequences of this design for runtime performance, 
>> especially for RSS consumption?
>>
>> Finally, how does the runtime decide what to use, munmap or madvise, for 
>> memory reclamation?
>>
>> Thank you
>>
>

