Igor Peshansky wrote:
On Thu, 30 Mar 2006, David Carter wrote:
It appears to me that, because the file is opened as O_TEXT, gawk is
hanging while it waits for the LF char to follow the CR (which never
comes). Does this sound likely to you?
If this theory were true, "echo -ne 'aa\rb' | gawk '{print $0}'" would
hang. It doesn't for me, even with textmode pipes...
Yes, I realized this myself soon after posting. Your echo command
doesn't hang for me either. As I said in my original post, this is one
of those annoying bugs where, if I try to make it hang interactively, it
always works correctly (never hangs), but if I run my regular script, it
usually (though not always) hangs. This is another clue that my initial
"theory" was incorrect: if it were true, the program would hang every
time.
Here's an example line, callable from a prompt, that usually hangs:
$ rsync -Pv sourcefile rmachine:/rpath/ | \
gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
To test this, I recommend using a source/remote combination for rsync
that will take about 30 seconds to a minute to complete. This will
create enough output for gawk to replicate the issue.
If this hangs (it may not hang the first time; give it 2 or 3 runs),
you'll stop getting output to stdout and it will just sit there. If you
go to another prompt to do a ps, you'll see that rsync is done running
but gawk is still sitting there. CTRL+C in the window running the script
does nothing. You need to kill the gawk process from another bash prompt.
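For anyone without a convenient source/remote pair for rsync, a rough stand-in producer might look like the sketch below. The names fake_progress and split_records and the text they print are made up for illustration. Real rsync -P output looks different, and there is no guarantee that this timing triggers the same hang.

```shell
#!/bin/sh
# Prefer gawk, as in the pipeline above; fall back to awk if gawk is not
# installed (a regex RS is an extension, not guaranteed by POSIX awk).
AWK=$(command -v gawk || command -v awk)

# Hypothetical stand-in for rsync's progress output: prints COUNT updates,
# each terminated by a bare CR, then one final LF-terminated line.
fake_progress() {
    count=$1
    delay=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'chunk %d of %d\r' "$i" "$count"
        sleep "$delay"
        i=$((i + 1))
    done
    printf 'done\n'
}

# Same awk program as in the pipeline above: split records on CR or LF
# and flush after every record.
split_records() {
    "$AWK" 'BEGIN { RS="\r|\n" } { print $0; fflush() }'
}

# Defaults: 3 updates, 1 second apart. Override with: script COUNT DELAY
fake_progress "${1:-3}" "${2:-1}" | split_records
```

Raising the count and delay (for example, 60 updates at 1 second each) gets closer to the 30 seconds to a minute of output suggested above.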
Try saving the output of rsync to file and running gawk over that
separately...
Good idea. Per your advice, I tried doing something like the following:
$ rsync -Pv sourcefile rmachine:/rpath/ > rsync.out
$ cat rsync.out | \
gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
Surprisingly, that code never hangs. Also, this never hangs:
$ rsync -Pv sourcefile rmachine:/rpath/ | xxd | xxd -r | \
gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
However, this usually hangs:
$ rsync -Pv sourcefile rmachine:/rpath/ | cat | \
gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
Also, if gawk really hangs, you can run it under strace to
see exactly what it was doing up to the hang (but please don't post the
strace output unless you're asked to do so by Corinna or CGF).
I tried something like the following:
$ rsync -Pv sourcefile rmachine:/rpath/ | strace \
gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
But, unfortunately, this never hangs. So I tried this:
$ ( sleep 10; rsync -Pv sourcefile rmachine:/rpath/ ) | \
gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'
and then went to another window and started strace on the gawk PID. This
usually hangs. Looking at the strace output, the last thing gawk does is:
87 22612601 [read_pipe] gawk 188 fhandler_base::read: returning 1, text mode
Every time it hangs, I get "read returning 1, text mode". If I look at
the strace output for the successful (non-hanging) executions, I never
get a "read returning 1, text mode".
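To compare hanging and non-hanging runs without reading each trace by eye, a small helper along these lines could count the suspicious reads. It is only a sketch: it assumes each trace was saved to a file and that the message appears on a single line in that file. The function name and any file names passed to it are hypothetical.

```shell
#!/bin/sh
# Count the one-byte text-mode reads in a saved strace log.
count_one_byte_reads() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'fhandler_base::read: returning 1, text mode' "$1"
}

# Print one count per trace file given on the command line.
# With no arguments, nothing is printed.
for log in "$@"; do
    printf '%s: %s\n' "$log" "$(count_one_byte_reads "$log")"
done
```

If the observation above holds, traces from hanging runs should report a count of at least 1 and traces from successful runs should report 0.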
All of this makes me wonder if:
a) rsync is perhaps doing something with its stdout file descriptor
that it shouldn't be doing, or
b) gawk is perhaps doing something with its stdin file descriptor
that it shouldn't be doing.
If a), then why doesn't it break when I just redirect the output of
rsync to a file? If b), then what is it about piping the output of rsync
to gawk that is different (from gawk's point of view) than when I just
save the rsync output to a file and then send the contents of the file
to gawk?
And another thing...why would any of this make any difference if gawk
opens the file as O_TEXT vs O_BINARY?
HTH,
It was a great help. Thanks, Igor. Any other light you can shed is much
appreciated.
Regards;
David Carter