I'm debugging an odd issue with Process.Wait, and wondering if anyone has 
any ideas or has hit this before:

I have a Go process that starts a child process (it sets DEATH_SIGNAL on 
that process; not sure if that's relevant).
A background goroutine then calls Wait on that process so that it can 
signal the main goroutine, which causes it to shut down.

A separate process sends `kill -9` to the child process, and then waits 
until the Go process exits.
This separate process waits up to a minute before giving up with an 
error. We're reaching that timeout, and decided to look further into it.

This is what we see:
1. The child process is a zombie
2. The Go process is alive and well
3. The Go process is waiting on the child process (we double-checked the 
pid)
4. It's stuck in this stack for that whole minute:
```
goroutine 22 [syscall]:
syscall.Syscall6(0xf7, 0x1, 0x136b, 0xc420034de8, 0x1000004, 0x0, 0x0, 0x100090100000000, 0x0, 0x30)
        /nix/store/yvw2v009phsdwj191jm4j2wsk28b2gxx-go-1.9.2/share/go/src/syscall/asm_linux_amd64.s:44 +0x5 fp=0xc420034d90 sp=0xc420034d88 pc=0x4784e5
os.(*Process).blockUntilWaitable(0xc4200b4e10, 0xc420034ec8, 0x48e364, 0x136b)
        /nix/store/yvw2v009phsdwj191jm4j2wsk28b2gxx-go-1.9.2/share/go/src/os/wait_waitid.go:31 +0xa5 fp=0xc420034e98 sp=0xc420034d90 pc=0x493775
os.(*Process).wait(0xc4200b4e10, 0x0, 0x0, 0xc4200b4e10)
        /nix/store/yvw2v009phsdwj191jm4j2wsk28b2gxx-go-1.9.2/share/go/src/os/exec_unix.go:22 +0x42 fp=0xc420034f20 sp=0xc420034e98 pc=0x48dd12
os.(*Process).Wait(0xc4200b4e10, 0xc4200b4e10, 0x0, 0x0)
        /nix/store/yvw2v009phsdwj191jm4j2wsk28b2gxx-go-1.9.2/share/go/src/os/exec.go:115 +0x2b fp=0xc420034f50 sp=0xc420034f20 pc=0x48d33b
```
5. This happens rarely, and only when the system is under a lot of 
stress, so our best guess is that the system is taking a long time to 
respond to the syscall, or that there is some other issue. The slow 
syscall theory is odd, though: we were able to abort the Go process and 
run `ps aux` without problems, so the system does not look completely 
stalled.
