[go-nuts] Go Stack Design Proposal

Arseny Samoylov Mon, 03 Nov 2025 01:25:07 -0800

Hello everyone,

I'd like to get feedback on an idea for changing how Go manages goroutine 
stack growth.
Below is a short draft of the proposal.

## Current State

Currently, almost every Go function prologue includes a stack growth check.
If the remaining stack space is insufficient, the runtime allocates a
larger stack and copies the old one, adjusting pointers to local variables
as needed.
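
To make the cost model concrete, here is a toy sketch of the check-and-copy behaviour described above. It is not the runtime's code: the `stack` type, the `call`/`ret` helpers, the 128-byte frame size, and growth by doubling from a 2 KB start are all assumptions made for illustration.

```go
package main

import "fmt"

// stack is a toy model of a goroutine stack (hypothetical, not a runtime type).
type stack struct {
	size   int // bytes in the current allocation
	used   int // bytes consumed by active frames
	copies int // how many times the stack was reallocated and copied
}

// call models a function prologue: check the remaining space and,
// if the frame does not fit, "move" to a stack twice as large.
func (s *stack) call(frame int) {
	for s.size-s.used < frame {
		s.size *= 2 // assumed growth policy: double the allocation
		s.copies++  // a real runtime would copy frames and fix up pointers here
	}
	s.used += frame
}

// ret models a function return releasing its frame.
func (s *stack) ret(frame int) { s.used -= frame }

func main() {
	s := &stack{size: 2048}
	for i := 0; i < 100; i++ {
		s.call(128) // 100 nested calls with 128-byte frames
	}
	fmt.Println(s.size, s.used, s.copies) // prints "16384 12800 3"
}
```

In this model the check runs on every call, while the copy happens only three times, which is the asymmetry the proposal tries to exploit.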

**Drawbacks of this approach:**

* Increased CPU usage due to frequent stack size checks and possible
reallocations
* Larger code size because of the additional prologue instructions

## Proposed Stack Management Mechanism
I would like to hear your opinion on the following stack growth mechanism
and whether it's worth exploring further.
If you think this idea has potential, I'll continue by estimating its
effect on CPU usage and code size and, if the estimates look good
enough, build a proof of concept.

### Reallocation via Page Faults
The idea is inspired by how Linux manages system thread stacks.

In Linux, each thread reserves (by default) 8 MB of virtual memory for its
stack. Physical memory is mapped lazily - new pages are allocated when the
thread touches them, via page faults.
When the stack limit is reached, the program aborts.

In Go, however, instead of aborting, we could reuse the existing stack
growth logic - relocating the stack to a larger chunk when a page fault
occurs near the stack boundary.
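
A minimal sketch of the decision the fault handler would have to make, assuming each goroutine stack has an unmapped guard region directly below its lowest usable address. The `goroutineStack` type, the `classify` function, and the linear scan are hypothetical. A real handler would take the faulting address from the signal information and would need a faster lookup.

```go
package main

import "fmt"

// goroutineStack describes a hypothetical stack mapping. Stacks grow down:
// usable bytes are [lo, hi), and [lo-guard, lo) is an unmapped guard region.
type goroutineStack struct {
	lo, hi uintptr
	guard  uintptr
}

type faultKind int

const (
	unrelated     faultKind = iota // not a stack fault: report as a normal crash
	stackOverflow                  // fault in a guard region: grow that stack
)

// classify reports whether addr falls in the guard region of any known
// stack, and if so, the index of that stack.
func classify(stacks []goroutineStack, addr uintptr) (faultKind, int) {
	for i, s := range stacks {
		if addr >= s.lo-s.guard && addr < s.lo {
			return stackOverflow, i
		}
	}
	return unrelated, -1
}

func main() {
	stacks := []goroutineStack{{lo: 0x10000, hi: 0x12000, guard: 0x1000}}
	for _, addr := range []uintptr{0xF800, 0x11000, 0x500000} {
		kind, idx := classify(stacks, addr)
		fmt.Printf("%#x: stack fault=%v index=%d\n", addr, kind == stackOverflow, idx)
	}
}
```

Only the first address lands in a guard region. The other two would have to be treated as ordinary faults, which is the distinction listed among the drawbacks below.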

**Potential drawbacks**
* The Go runtime would need to handle page faults:
  * This might increase the number of page faults and add handling
    overhead
  * It could be tricky to distinguish between stack-related and unrelated
    page faults
* A large number of goroutines would consume a large amount of virtual
address space
* The minimal stack size would effectively increase from 2 KB to 4 KB (one
physical page). In the worst case, when every goroutine uses less than 2 KB
of stack space, this would double stack memory consumption
* This mechanism would depend on OS-level signal handling and may require
platform-specific implementations

The main concern, as I see it, is the increased use of virtual address
space.
A rough estimate:
100k goroutines with 8 MB stacks each would reserve ~800 GB (2^3 * 10^5 *
2^20 B ~ 8.4 * 10^11 B, just under 2^40 B), i.e., about 0.3% (roughly 1/330)
of a 48-bit (2^48 B) virtual address space.
This seems acceptable, especially since we can reserve less than 8 MB.
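
A quick way to check this kind of estimate for different goroutine counts and reservation sizes. The helper name `reservedBytes` is made up for this sketch, and the 48-bit address space is taken from the text above. Note that user space typically gets only part of it.

```go
package main

import "fmt"

// reservedBytes returns the virtual address space reserved when every
// goroutine gets its own stack reservation of stackBytes.
func reservedBytes(goroutines, stackBytes uint64) uint64 {
	return goroutines * stackBytes
}

func main() {
	const addressSpace = uint64(1) << 48 // 48-bit virtual address space

	r := reservedBytes(100_000, 8<<20) // 100k goroutines, 8 MB each
	fmt.Println("reserved bytes:", r)                       // 838860800000
	fmt.Println("address space / reserved:", addressSpace/r) // 335

	// A smaller reservation, e.g. 1 MB per goroutine, scales linearly.
	small := reservedBytes(100_000, 1<<20)
	fmt.Println("with 1 MB stacks:", addressSpace/small) // 2684
}
```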

The second concern is the larger minimum stack size (4 KB vs 2 KB). This
could double memory consumption in the worst case.
I'm not yet sure whether this trade-off would be acceptable or if it can be
mitigated.
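
To get a feel for when the 4 KB granularity hurts, here is a simplified comparison of the two sizing policies. It assumes that today's stacks are sized 2 KB * 2^k and that a page-fault-based stack commits whole 4 KB pages on demand. `doublingSize` and `pagedSize` are hypothetical helpers, and real behaviour (growth thresholds, guard space) will differ.

```go
package main

import "fmt"

const (
	minStack = 2048 // assumed current minimum stack size
	pageSize = 4096 // assumed physical page size
)

// doublingSize models the current scheme: the smallest stack of the
// form minStack * 2^k that holds `used` bytes.
func doublingSize(used uint64) uint64 {
	size := uint64(minStack)
	for size < used {
		size *= 2
	}
	return size
}

// pagedSize models the proposed scheme: `used` rounded up to whole
// pages, with at least one page committed.
func pagedSize(used uint64) uint64 {
	pages := (used + pageSize - 1) / pageSize
	if pages == 0 {
		pages = 1
	}
	return pages * pageSize
}

func main() {
	for _, used := range []uint64{1024, 3000, 9000} {
		fmt.Printf("used=%d doubling=%d paged=%d\n",
			used, doublingSize(used), pagedSize(used))
	}
}
```

Under these assumptions only goroutines that stay below 2 KB pay the doubled cost (2048 vs 4096 bytes). At 3000 bytes both policies commit 4096, and at 9000 bytes the paged model commits less (12288 vs 16384).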

Cross-platform support is also a major concern.

## Additional notes
* The current implementation supports stack shrinking (when less than 1/4
of the stack is used). I guess we could shrink the stack by releasing unused
pages with MADV_DONTNEED.
* Stack growth checks are currently tied to goroutine preemption.
Removing them might indirectly affect the scheduler. However, Go also has
asynchronous preemption in addition to these cooperative checks, so this may
not be a major issue.

## Conclusion
What do you think about this idea?
Is this direction worth exploring further, i.e., getting concrete
estimates of the performance improvement and building a PoC?

Thank you for your time and feedback!

