I was having the same problem without TLS but the symptoms are exactly the same, even the part "The "loop" seems to be interruptable by triggering a new connection, or by itself after ~1m54s"
I spent 4 days trying to get a consistent way to reproduce it but coulnd't so I gave up and keept using Go 1.16. I am glad that it may be fixed in 1.19. Thank you for pointing it out! El jueves, 28 de abril de 2022 a las 18:23:25 UTC+2, klau...@gmail.com escribió: > This "smells" a lot like https://github.com/golang/go/issues/51654 - > though the repro is with TLS. > > This is very disruptive for us (MinIO) as well, seeing this at several > customers. Obviously downgrading to Go 1.16.6 is less than optimal for us. > > /Klaus > > On Tuesday, 12 April 2022 at 14:55:41 UTC+2 Brian Candler wrote: > >> Correct: "go run" builds a binary and then fork/execs the binary it has >> built, to run it. >> >> On Tuesday, 12 April 2022 at 08:19:38 UTC+1 scorr...@gmail.com wrote: >> >>> By the way, the difference that I thought I saw between go run and the >>> compiled version was because the pid of the process to look at in the below >>> example should be 63668, the tmp file, not 63548, the go run call. >>> >>> $ ps aux | grep test >>> bill 63548 0.2 0.1 2044984 17588 pts/0 Sl+ 09:15 0:00 go >>> run test.go >>> bill 63668 155 0.0 1891292 12140 pts/0 Sl+ 09:15 1:52 >>> /tmp/go-build910547164/b001/exe/test >>> bill 63750 0.0 0.0 11564 652 pts/1 S+ 09:16 0:00 grep >>> --color=auto test >>> >>> El martes, 12 de abril de 2022 a las 9:13:41 UTC+2, Santiago Corredoira >>> escribió: >>> >>>> Hi, yes, all about the toy program is the same. It was an attempt to >>>> get to a minimal example that anyone can reproduce so I could ask for >>>> help. >>>> The real problem is with a larger program where I get this 100% CPU usages >>>> ramdomly when the server is idle. >>>> >>>> I was thinking and with the basic example the nanosleep calls maybe is >>>> not the scheduller signaling itself because: >>>> >>>> - GODEBUG=asyncpreemptoff=1 has no effect >>>> - It doesn't happen with Go 1.16 >>>> - It is random. If it where the normal working of the scheduler, >>>> wouldn't it be predictable? >>>> >>>> >>>> El martes, 12 de abril de 2022 a las 8:59:06 UTC+2, Brian Candler >>>> escribió: >>>> >>>>> At the start of the thread, you said "if I compile and run this simple >>>>> program ..." >>>>> >>>>> Does the problem no longer reproduce with that? >>>>> >>>>> On Monday, 11 April 2022 at 22:59:55 UTC+1 scorr...@gmail.com wrote: >>>>> >>>>>> I tried with a more complex application and already got constant 100% >>>>>> CPU usage twice. >>>>>> >>>>>> When I attach strace I see the same as with the minimal demo but with >>>>>> much more frecuency: >>>>>> >>>>>> [pid 42659] 23:36:49.035120 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035133 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> [pid 42659] 23:36:49.035145 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035158 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> [pid 42659] 23:36:49.035170 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035182 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> [pid 42659] 23:36:49.035195 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035207 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> [pid 42659] 23:36:49.035219 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035232 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> [pid 42659] 23:36:49.035244 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035257 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> [pid 42659] 23:36:49.035269 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035281 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> [pid 42659] 23:36:49.035293 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 140341803388040) = 0 >>>>>> [pid 42659] 23:36:49.035306 epoll_pwait(3, [], 128, 0, NULL, >>>>>> 18704162493558) = 0 >>>>>> >>>>>> The program is too large to post here. I have tested it with go race >>>>>> and it doesn't run into any problem. Any idea of what to look for? >>>>>> >>>>>> Thank you! >>>>>> >>>>>> El lunes, 11 de abril de 2022 a las 17:08:40 UTC+2, Santiago >>>>>> Corredoira escribió: >>>>>> >>>>>>> I tried with the stripped tmp version of "go run" and after a while >>>>>>> I was able to reproduce it too. It definitely took me more time but >>>>>>> probably is just random. >>>>>>> >>>>>>> I tried "$ GODEBUG=asyncpreemptoff=1 ./test" and I kept seeing the >>>>>>> the calls every 10ms. Is this expected to happen? >>>>>>> >>>>>>> I compiled the test with "$ go1.16 build test.go" and I tried hard >>>>>>> to reproduce it but I couldn't. It is very strange because this >>>>>>> behaviour >>>>>>> was introduced in Go 1.14 and I only see it in Go 1.17. Also >>>>>>> asyncpreemptoff=1 seems to make no difference. >>>>>>> >>>>>>> I am going to work again with Go 1.18 by default to see if I get the >>>>>>> 100% CPU usage. >>>>>>> >>>>>>> El lunes, 11 de abril de 2022 a las 13:28:48 UTC+2, Brian Candler >>>>>>> escribió: >>>>>>> >>>>>>>> On Monday, 11 April 2022 at 09:26:28 UTC+1 scorr...@gmail.com >>>>>>>> wrote: >>>>>>>> >>>>>>>>> and the output keeps growing like that with every new curl request >>>>>>>>> until eventually it starts constantly printing output like this: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> [pid 143818] 10:02:22.697120 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.707609 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.718125 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.728753 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.739195 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.749618 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.760160 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.770669 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.781137 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.791581 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.802060 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.812540 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.822981 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> [pid 143818] 10:02:22.833468 nanosleep({tv_sec=0, >>>>>>>>>> tv_nsec=10000000}, NULL) = 0 >>>>>>>>>> >>>>>>>>>> >>>>>>>> That is normal, I believe, for a process with multiple runnable >>>>>>>> goroutines - one alarm/wakeup call every 10ms. >>>>>>>> >>>>>>>> What is odd is the 100% CPU utilisation. Can you attach an strace >>>>>>>> when that happens? >>>>>>>> >>>>>>>> As for the binary produced by "go run" - use "ps" to see where it >>>>>>>> is. For example: >>>>>>>> >>>>>>>> ==> sleepy.go <== >>>>>>>> package main >>>>>>>> >>>>>>>> import ( >>>>>>>> "sync" >>>>>>>> "time" >>>>>>>> ) >>>>>>>> >>>>>>>> func main() { >>>>>>>> var wg sync.WaitGroup >>>>>>>> wg.Add(1) >>>>>>>> go func() { >>>>>>>> time.Sleep(30 * time.Second) >>>>>>>> wg.Done() >>>>>>>> }() >>>>>>>> wg.Wait() >>>>>>>> } >>>>>>>> ==> end <== >>>>>>>> >>>>>>>> $ go run sleepy.go >>>>>>>> >>>>>>>> ... in another window ... >>>>>>>> $ ps auxwww | grep sleepy >>>>>>>> ubuntu 17743 0.0 0.1 1237424 12428 pts/1 Sl+ 11:23 0:00 >>>>>>>> go run sleepy.go >>>>>>>> ubuntu 17782 0.0 0.0 702468 1052 pts/1 Sl+ 11:23 0:00 >>>>>>>> /tmp/go-build2834561198/b001/exe/sleepy >>>>>>>> >>>>>>>> You can 'cp' it somewhere to keep it safe. >>>>>>>> >>>>>>>> If I compare this with the output of "go build sleepy.go", I see >>>>>>>> that the "go build" version is larger and unstripped: >>>>>>>> >>>>>>>> $ cp /tmp/go-build2834561198/b001/exe/sleepy sleepy-run >>>>>>>> $ file sleepy-run >>>>>>>> sleepy-run: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), >>>>>>>> statically linked, stripped >>>>>>>> $ go build sleepy.go >>>>>>>> $ file sleepy >>>>>>>> sleepy: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), >>>>>>>> statically linked, not stripped >>>>>>>> >>>>>>>> Then "strip sleepy" gives me a binary which is almost, but not >>>>>>>> quite, the same size as the one from "go run". So that gives you >>>>>>>> another >>>>>>>> way to try to reproduce the problem - although if the difference is >>>>>>>> just >>>>>>>> between stripped and unstripped, then it's most likely some sort of >>>>>>>> edge >>>>>>>> timing case. >>>>>>>> >>>>>>> -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/d9d94a10-0689-420e-9c3f-210db7249566n%40googlegroups.com.