On Sun, Jul 5, 2020 at 1:05 PM Ian Lance Taylor <i...@golang.org> wrote:
> On Sun, Jul 5, 2020 at 10:54 AM Marcin Romaszewicz <marc...@gmail.com> wrote:
> >
> > I'm hitting a problem using os/exec Cmd.Start to run a process.
> >
> > I'm setting Cmd.Stdout and Cmd.Stderr to the same instance of an io.Pipe,
> > and spawn a goroutine to consume the pipe reader until I reach EOF. I then
> > call cmd.Start(), do some additional work, and call cmd.Wait(). The runtime
> > of the executable I launch is 15-30 minutes, and stdout/stderr output is
> > minimal, a few tens of kB during this 15-30 minute run.
> >
> > When the pipe reaches EOF or errors out, I close the pipe reader, exit
> > the goroutine reading the pipe, and that's when cmd.Wait() returns, exactly
> > as documented.
> >
> > This works exactly as described about 70% of the time. The remaining 30%
> > of the time, cmd.Wait() returns an error, which stringifies as "signal:
> > broken pipe". I'm running thousands of copies of this executable across
> > thousands of instances in AWS, so I have a big data set here. The broken
> > pipe error happens at the very end when my exec'd executable is exiting, so
> > as far as I can tell, it has run successfully and is hitting this error on
> > exit.
> >
> > I realize that SIGPIPE and EPIPE are common ways that processes clean
> > each other up, and that shells do a lot of work hiding them, so I've also
> > tried using exec.Cmd to spawn bash, which in turn runs my executable, but I
> > still get a lot of these deaths due to SIGPIPE.
> >
> > I've tried to reproduce this with simple commands, like
> > `cat <longfile.txt>`, and none of these simple commands ever results in the
> > broken pipe, and I capture all their output without issue. The command I'm
> > running differs in that it uses quite a lot of resources and the machine is
> > doing significant work when the executable is exiting. However, the SIGPIPE
> > is being received by the application, not my Go code, implying that the Go
> > side is closing the pipe. I can't find where this is happening.
> >
> > Any tips on how to chase this down?
>
> The executable is dying due to receiving a SIGPIPE signal. As you
> know, that means that it made a write system call to a pipe that had
> no open readers. If you're confident that you are reading all the
> data from the pipe in the Go program, then the natural first thing to
> check is the other possible pipe: if you are reading from stdout,
> check what happens on stderr, and vice versa.
>
> That probably won't help, so since you can reproduce it with some
> reliability, try running the whole system under strace -f. That will
> show you the system calls both of your program and of the subprocess,
> and should let you determine exactly which write is triggering the
> SIGPIPE, and let you verify that the read end of the pipe has been
> closed.
>
> And if that doesn't help, perhaps you can modify the subprocess to
> catch SIGPIPE and get a stack trace, again with the goal of finding
> out exactly what write is failing.
>
> Hope this helps.

Thanks for the tips. The comment on Stdout and Stderr on cmd says:

    // If Stdout and Stderr are the same writer, and have a type that can
    // be compared with ==, at most one goroutine at a time will call Write.

Using an io.Pipe shared between these two should result in both being drained correctly, right?

> Ian