I am pretty sure the runtime is supposed to crash the process if it slows the 
allocators “too much” (I believe there are some config settings to control 
this).

If you have enough goroutines it may look like they are hung, so you need to 
track specific goroutines by their ID. The stack certainly looks like it is 
trying to allocate memory for a slice and is being paused because it is 
waiting on a GC lock (the semacquire inside runtime.gcMarkDone).
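
To track goroutines by ID across time, one option is to dump all stacks from 
inside the process at intervals and compare the state of each ID between 
dumps. Below is a minimal sketch, assuming you can add code to the 
application; dumpGoroutines and goroutineStates are names I made up for 
illustration. Note that the debug=2 dump briefly stops the world, so if the 
runtime is truly wedged inside the GC, this call may block as well.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"runtime/pprof"
)

// goroutineHeader matches header lines such as
// "goroutine 462 [semacquire, 5 minutes]:".
var goroutineHeader = regexp.MustCompile(`(?m)^goroutine (\d+) \[([^\]]+)\]:`)

// dumpGoroutines returns the stacks of all goroutines in the same text
// format the runtime prints for an unrecovered panic (debug level 2).
func dumpGoroutines() string {
	var buf bytes.Buffer
	// Writing to a bytes.Buffer does not fail, so the error is ignored here.
	_ = pprof.Lookup("goroutine").WriteTo(&buf, 2)
	return buf.String()
}

// goroutineStates maps each goroutine ID found in a dump to the state text
// inside the square brackets (wait reason and, if present, wait duration).
func goroutineStates(dump string) map[string]string {
	states := make(map[string]string)
	for _, m := range goroutineHeader.FindAllStringSubmatch(dump, -1) {
		states[m[1]] = m[2]
	}
	return states
}

func main() {
	for id, state := range goroutineStates(dumpGoroutines()) {
		fmt.Printf("goroutine %s: %s\n", id, state)
	}
}
```

If the same ID shows the same wait reason with a growing wait duration over 
several dumps, that goroutine is really stuck rather than just slow.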

Are you calling any native code? You could have heap corruption, and that 
could also be why the GC thread is getting hung.

If the GC thread makes progress, albeit slowly, I would settle on my first 
guess: a memory leak.
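
Besides the gctrace output, a leak can be checked from inside the process by 
sampling the heap counters periodically. This is only a rough sketch: 
heapSample, sampleHeap and growing are hypothetical helpers, and the short 
interval in main is just for demonstration (on a leak that takes days you 
would sample every few minutes and write to your log). 
runtime.ReadMemStats also stops the world briefly, so it may block if the GC 
itself is stuck.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// heapSample holds the counters most useful for spotting a leak.
type heapSample struct {
	HeapAlloc uint64 // bytes of allocated heap objects not yet freed
	HeapSys   uint64 // bytes of heap memory obtained from the OS
	NumGC     uint32 // number of completed GC cycles
}

// sampleHeap reads the current memory statistics from the runtime.
func sampleHeap() heapSample {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return heapSample{HeapAlloc: m.HeapAlloc, HeapSys: m.HeapSys, NumGC: m.NumGC}
}

// growing reports whether HeapAlloc increased between every pair of
// consecutive samples. It returns false for fewer than two samples.
func growing(samples []heapSample) bool {
	if len(samples) < 2 {
		return false
	}
	for i := 1; i < len(samples); i++ {
		if samples[i].HeapAlloc <= samples[i-1].HeapAlloc {
			return false
		}
	}
	return true
}

func main() {
	var samples []heapSample
	for i := 0; i < 3; i++ {
		s := sampleHeap()
		fmt.Printf("heapAlloc=%d heapSys=%d numGC=%d\n", s.HeapAlloc, s.HeapSys, s.NumGC)
		samples = append(samples, s)
		time.Sleep(100 * time.Millisecond)
	}
	fmt.Println("heap grew on every sample:", growing(samples))
}
```

A heap that keeps climbing across many GC cycles, with NumGC still 
advancing, would fit the leak theory; NumGC that stops advancing would point 
more toward the GC itself being stuck.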

> On May 29, 2024, at 10:00 AM, 'Alex Kristiansen' via golang-nuts 
> <golang-nuts@googlegroups.com> wrote:
> 
> That's an interesting idea, I probably wouldn't have thought of that on my 
> own. Is that expected behavior for memory pressure on Windows+golang? I don't 
> have much windows experience, so my assumption would be that the Windows 
> equivalent of the OOMKiller would kick in and just kill the application. 
> 
> On Tuesday, May 28, 2024 at 4:04:15 PM UTC-7 robert engels wrote:
> Feels like a memory leak to me. I would look for growing heap size in the gc 
> logs. I am guessing that the system is not completely hung - but rather the 
> runtime is having a hard time obtaining more memory, so it is slowing the 
> allocators to a rate that makes them appear hung.
> 
> It may be that the process has consumed nearly all of the OS memory too, so 
> the OS is having a hard time responding to malloc requests.
> 
> i.e. The system is not making progress.
> 
> 
>> On May 28, 2024, at 3:54 PM, 'Alex Kristiansen' via golang-nuts 
>> <golan...@googlegroups.com> wrote:
>> 
>> This is an odd one. For reference, this is a customer machine running 
>> Windows Server 2016, and the application is compiled with go1.20.11. It 
>> just hangs after a number of days; a Windows minidump reveals that most 
>> threads are doing this:
>> 
>>   Goroutine 462 - User: unicode/utf16/utf16.go:106 unicode/utf16.Decode 
>> (0xe6e391) [semacquire]
>>          0  0x0000000000e2d116 in runtime.gopark
>>              at runtime/proc.go:382
>>          1  0x0000000000e3df5c in runtime.goparkunlock
>>              at runtime/proc.go:387
>>          2  0x0000000000e3df5c in runtime.semacquire1
>>              at runtime/sema.go:160
>>          3  0x0000000000e0ac2f in runtime.semacquire
>>              at runtime/sema.go:111
>>          4  0x0000000000e0ac2f in runtime.gcMarkDone
>>              at runtime/mgc.go:787
>>          5  0x0000000000128c10 in ???
>>              at ?:-1
>>          6  0x0000000000dfe7da in runtime.deductAssistCredit
>>              at runtime/malloc.go:1217
>>          7  0x0000000000dfdff0 in runtime.mallocgc
>>              at runtime/malloc.go:932
>>          8  0x0000000000e3f972 in runtime.makeslice
>>              at runtime/slice.go:103
>>          9  0x0000000000e6e391 in unicode/utf16.Decode
>>              at unicode/utf16/utf16.go:106
>>         10  0x0000000000e72a7b in syscall.UTF16ToString
>>              at syscall/syscall_windows.go:63
>>         11  0x0000000000eb7a67 in os.(*File).readdir
>>              at os/dir_windows.go:43
>>         12  0x0000000000eb72c5 in os.(*File).Readdirnames
>>              at os/dir.go:70
>>         13  0x0000000000fb623a in path/filepath.glob
>>              at path/filepath/match.go:346
>>         14  0x0000000000fb5ea5 in path/filepath.globWithLimit
>>              at path/filepath/match.go:273
>>         15  0x00000000031de255 in path/filepath.Glob
>>              at path/filepath/match.go:243
>> 
>> Multiple threads, all waiting on a semaphore inside mallocgc. The actual 
>> reason we're waiting for memory changes, obviously (sometimes a string 
>> method, or hashmap allocation, or logging call...). This behavior has 
>> remained consistent across multiple hangs across a number of weeks. After a 
>> bit of digging into the minidump, this (I think) is the thread that's 
>> holding the semaphore(s), at least based on the line numbers:
>> 
>>   Goroutine 37 - User: :0 ??? (0x7ffb0e056974) (thread 5628)
>>         0  0x00007ffb0e056974 in ???
>>             at ?:-1
>>         1  0x0000000000e5c5a0 in runtime.systemstack_switch
>>             at runtime/asm_amd64.s:463
>>         2  0x0000000000e0ade5 in runtime.gcMarkDone
>>             at runtime/mgc.go:855
>>         3  0x0000000000128c10 in ???
>>             at ?:-1
>>         4  0x0000000000e5e881 in runtime.goexit
>>             at runtime/asm_amd64.s:1598
>> 
>> Note that the precise point in gcMarkDone can change. I have another 
>> minidump showing a hang at a different point in the same method:
>> 
>>   Goroutine 19 - Go: :0 ??? (0x208351516f8) (thread 7164) [unknown wait 
>> reason 30]
>>         0  0x00007ffe66e66974 in ???
>>             at ?:-1
>>         1  0x0000000000e9c5a0 in runtime.systemstack_switch
>>             at runtime/asm_amd64.s:463
>>         2  0x0000000000e4acff in runtime.gcMarkDone
>>             at runtime/mgc.go:807
>>         3  0x0000000000128c10 in ???
>>             at ?:-1
>>         4  0x0000000000e9e881 in runtime.goexit
>>             at runtime/asm_amd64.s:1598
>> 
>> The commonality between these threads appears to be the `stacktrace()` 
>> method. 
>> 
>> Does anyone have any ideas for how to further debug this? Has anyone seen 
>> anything like this? I've never seen anything like this before, and I can't 
>> really reproduce it, as the only behavior is "let the application sit and 
>> run for a few days." The only vaguely similar issue I've found is this, 
>> which mentions interference from AV software: 
>> https://github.com/golang/go/issues/52178. 
>> 
>> Right now I'm waiting for the results of a gctrace log, as well as setting 
>> GODEBUG=asyncpreemptoff=1, but other than that, I'm a bit out of ideas. 
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to golang-nuts...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/golang-nuts/9c78ef6f-0190-4981-ab1f-ed18ad2f1080n%40googlegroups.com.
> 
> 

