Volker Quetschke wrote:
Christopher Faylor wrote:
On Wed, Oct 19, 2005 at 03:45:30PM -0400, Volker Quetschke wrote:
(snip)
Given the number of changes that have been made to cygwin, particularly in /proc handling, it's very difficult for me to believe that you are not seeing *any* differences in behavior and
Well, there are differences in the frequency of occurrence of the hangs.
I'm wondering if you're actually seeing what you think you're seeing, i.e., I'm wondering if the process is just timing out and you are attributing it coming "unstuck" to the fact that you're doing "ls /proc/*/fd". I can't see any reason why inspecting /proc should cause any kind of special behavior in the latest snapshots since /proc handling now occurs in its own thread.

I can completely understand your worries. My problem is that I cannot reproduce the problem myself and all I can do is ask the people who have this problem to try to get some debug information. I just asked for a confirmation that it really is the "ls /proc/*/fd" that "unstucks" the process. I don't believe that "/usr/bin/tcsh -fc pwd" needs a long time to finish so that we're getting a coincidence there.
I got some information back. It is done like this: the build is running/hanging in one shell (1). When it hangs, start a new tcsh shell (2) and get the ps and cygcheck information. Then open a new bash (3) and start "strace -p <pidhang>". Now in (2) start

  while 1
    ls /proc/<pidhang>/fd
  end

until the strace is ready.

Some details: The build is running on a local NTFS drive. It's a dedicated machine; not much is running besides the build. He wrote that 20051019 also produced a hang and that I'll get the next ;) strace.

Clueless
  Volker
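For anyone who wants to repeat the polling step from a bash or sh prompt instead of tcsh, the loop from shell (2) could be sketched roughly as below. This is only an illustration under assumptions: the function name poll_fd_while and its arguments are made up for this example, and "until the strace is ready" is read here as "until the strace process has gone away".

```shell
# Sketch only: poll_fd_while and its arguments are hypothetical names,
# not part of Cygwin or of the procedure described above.
# Usage: poll_fd_while <target_pid> <watcher_pid> [max_iterations] [pause_seconds]
poll_fd_while() {
    target_pid=$1           # pid of the hung process (<pidhang> above)
    watcher_pid=$2          # pid of the process we wait for, e.g. the strace
    max_iterations=${3:-0}  # 0 means no limit
    pause_seconds=${4:-1}
    count=0
    # kill -0 only tests whether the watcher still exists; it sends no signal.
    while kill -0 "$watcher_pid" 2>/dev/null; do
        # The listing itself is what matters here, so its output is discarded.
        ls "/proc/$target_pid/fd" >/dev/null 2>&1
        count=$((count + 1))
        if [ "$max_iterations" -gt 0 ] && [ "$count" -ge "$max_iterations" ]; then
            break
        fi
        sleep "$pause_seconds"
    done
    # Report how many listings were made.
    echo "$count"
}
```

It would be called as "poll_fd_while <pidhang> <pid of the strace>"; the optional third argument caps the number of listings so the loop cannot run forever if the watcher never exits.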
Having said that, I never realized that before, maybe the problem really lies in this special command. I mean due to some historic quirks every makefile in the OOo tree has a line that sets a macro to the current path using that command, but there are still lots of other commands (also executed in a tcsh shell) in these makefiles that I have never heard of hanging. (I'll also verify that what I just said is really true, it's just an idea.)

I could almost convince myself that there was a race in /proc handling before but I could never convince myself that doing something like "ls /proc/*/fd" would have any effect on it. Nevertheless, I did make some changes to eliminate the potential source of hangs in this code. So, I can't understand why you wouldn't see something different.

I have no clue either, especially as I also cannot reproduce and therefore cannot pinpoint the problem. :(

Anyway, thanks for all your efforts!
  Volker
--
PGP/GPG key (ID: 0x9F8A785D) available from wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913 9E53 3D35 C9BA 9F8A 785D