We are discussing this issue on GitHub: https://github.com/golang/go/issues/23360

On Sunday, January 7, 2018 at 11:22:52 PM UTC+8, she...@pingcap.com wrote:
>
> #1  0x00000000004293f2 in runtime.futexsleep (addr=0x1b0a950 <runtime.m0+272>, val=0, ns=-1) at /usr/local/go/src/runtime/os_linux.go:45
>
>
> I dug into the Go 1.9.2 source code and found this:
> https://github.com/golang/go/blob/bf9ad7080d0a22acf502a60d8bc6ebbc4f5340ef/src/runtime/os_linux.go#L45
>
>> // Some Linux kernels have a bug where futex of
>> // FUTEX_WAIT returns an internal error code
>> // as an errno. Libpthread ignores the return value
>> // here, and so can we: as it says a few lines up,
>> // spurious wakeups are allowed.
>> if ns < 0 {
>>     futex(unsafe.Pointer(addr), _FUTEX_WAIT, val, nil, nil, 0)
>>     return
>> }
>
>
> Could this mean that I have hit a Linux kernel bug? The kernel is
> "3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux"
>
> On Sunday, January 7, 2018 at 7:57:52 PM UTC+8, she...@pingcap.com wrote:
>>
>> The same problem occurs again with the same error message.
>>
>> The pstack result:
>>
>>> Thread 1 (process 12230):
>>> #0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:439
>>> #1  0x00000000004293f2 in runtime.futexsleep (addr=0x1b0a950 <runtime.m0+272>, val=0, ns=-1) at /usr/local/go/src/runtime/os_linux.go:45
>>> #2  0x000000000040f9ab in runtime.notesleep (n=0x1b0a950 <runtime.m0+272>) at /usr/local/go/src/runtime/lock_futex.go:151
>>> #3  0x00000000004315f5 in runtime.stopm () at /usr/local/go/src/runtime/proc.go:1680
>>> #4  0x00000000004327d2 in runtime.findrunnable (gp=0x45cbd1 <runtime.goexit+1>, inheritTime=false) at /usr/local/go/src/runtime/proc.go:2135
>>> #5  0x000000000043328c in runtime.schedule () at /usr/local/go/src/runtime/proc.go:2255
>>> #6  0x00000000004335a6 in runtime.park_m (gp=0xca01842780) at /usr/local/go/src/runtime/proc.go:2318
>>> #7  0x0000000000459ffb in runtime.mcall () at /usr/local/go/src/runtime/asm_amd64.s:286
>>> #8  0x0000000001b0a100 in ?? ()
>>> #9  0x00007ffdc5be26e0 in ?? ()
>>> #10 0x0000000001b0a100 in ?? ()
>>> #11 0x00007ffdc5be26d0 in ?? ()
>>> #12 0x0000000000430544 in runtime.mstart () at /usr/local/go/src/runtime/proc.go:1152
>>> #13 0x0000000000459ec1 in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:186
>>> #14 0x000000000000000a in ?? ()
>>> #15 0x00007ffdc5be2718 in ?? ()
>>> #16 0x000000000000000a in ?? ()
>>> #17 0x00007ffdc5be2718 in ?? ()
>>> #18 0x0000000000000000 in ?? ()
>>>
>>
>>
>> On Sunday, January 7, 2018 at 7:52:00 PM UTC+8, she...@pingcap.com wrote:
>>>
>>> We enable race detection in the test environment and disable it when
>>> building the binaries to be published.
>>> I double-checked the build environment to make sure race detection
>>> is disabled, because we care about performance very much.
>>>
>>> On Saturday, January 6, 2018 at 7:04:09 PM UTC+8, Dave Cheney wrote:
>>>>
>>>> You can still check for races if you build your production binary with
>>>> -race and deploy it as normal.
>>>> There will be some performance hit, so you
>>>> probably shouldn't do this for all your binaries, but it will be a cheap
>>>> way to flush out any data races in your code.
>>>>
>>>> On Saturday, 6 January 2018 21:15:56 UTC+11, she...@pingcap.com wrote:
>>>>>
>>>>> Thanks for your advice! I got the error message and the pstack result
>>>>> screenshot from one of our clients. I will try to use some OCR tools to
>>>>> convert the image to text next time.
>>>>>
>>>>> For the questions:
>>>>> 1. The binary is built without the race detector flag. I have checked it.
>>>>> 2. We do not use cgo. I will check whether the unsafe package is used.
>>>>>
>>>>> Unfortunately, I cannot reproduce this problem. This is the first
>>>>> and only time I have encountered it.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Saturday, January 6, 2018 at 1:28:24 AM UTC+8, Ian Lance Taylor wrote:
>>>>>>
>>>>>> On Fri, Jan 5, 2018 at 7:17 AM, <she...@pingcap.com> wrote:
>>>>>> >
>>>>>> > I encountered a strange problem when running a program on Linux. I get
>>>>>> > "fatal: morestack on g0" on stderr. The process is still there but does
>>>>>> > not respond anymore. When I use `curl
>>>>>> > http://ip:port/debug/pprof/goroutine?debug=1` to check the stack, it
>>>>>> > hangs. There is nothing useful in stderr or dmesg.
>>>>>>
>>>>>> (I would like to encourage you and everyone else to not post
>>>>>> screenshots of text. Just include the text in the e-mail message as
>>>>>> text. Screenshots of images, sure, but not text. Thanks.)
>>>>>>
>>>>>> The error "fatal: morestack on g0" should, of course, never happen.
>>>>>> The first questions are standard: have you run your program with the
>>>>>> race detector? Do you use cgo or the unsafe package?
>>>>>>
>>>>>> Beyond that, does the problem happen consistently? Is there a way
>>>>>> that we can reproduce it?
>>>>>>
>>>>>> Ian
>>>>>>