Christopher Faylor wrote:
On Fri, May 26, 2006 at 10:37:52AM -0500, mwoehlke wrote:
Way way back in the OP, I mentioned that Interix doesn't have this
problem, which would imply a "design flaw" in Cygwin. Maybe (probably)
it is a *necessary* design flaw, BUT...
A "necessary" design flaw? Interesting concept.
Well, sure... something that is known to be problematic (I think this
qualifies) that is nevertheless necessary because of limitations in
Windoze. At least, I would have to guess that the slow code is
necessitated by Windows, since it works better on Interix and
wonderfully on Linux.
$ time ./foo
real 0m0.296s
user 0m0.000s
sys 0m0.005s
This is a 50x difference where the difference should be minimal. That
said, under Interix it still takes almost 2 seconds, which is a 7x-ish
difference, but still nowhere near 50x, so the problem isn't (entirely)
the NFS client or the latency w.r.t. the share.
Summary (execution times):
Platform   | Little script | Big script | stat
-----------+---------------+------------+-----------
Linux      |        0.296s |     1.105s | 0.122s(2)
Interix(1) |        1.828s |     5.536s | (3)
Cygwin(1)  |       10.688s |  6m37.578s | 0.350s(4)
Notes:
1 - These are the same physical box.
2 - On Linux, 'stat' took the same amount of time on either script.
3 - My Interix installation apparently doesn't have 'stat', so this
entry is blank.
4 - The time taken on Cygwin varied a lot, up to about 1s, and was a
little (but not a lot) longer for the large script. 0.35s is roughly the
lower bound (for both).
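For anyone who wants to redo the arithmetic, here is a minimal sketch that turns the table's figures into slowdown factors relative to Linux. The helper names (`to_seconds`, `slowdown`) are made up for illustration; only the timings themselves come from the table.

```python
import re

def to_seconds(text):
    """Convert a `time`-style duration such as '6m37.578s' or '0.296s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))

# Timings copied from the summary table (script execution only).
TIMES = {
    "Linux":   {"little": "0.296s",  "big": "1.105s"},
    "Interix": {"little": "1.828s",  "big": "5.536s"},
    "Cygwin":  {"little": "10.688s", "big": "6m37.578s"},
}

def slowdown(platform, script, baseline="Linux"):
    """Return how many times slower `platform` ran `script` than `baseline`."""
    return to_seconds(TIMES[platform][script]) / to_seconds(TIMES[baseline][script])

for platform in ("Interix", "Cygwin"):
    for script in ("little", "big"):
        factor = slowdown(platform, script)
        print(f"{platform:8s} {script:6s} {factor:7.1f}x vs Linux")
```

Passing `baseline="Interix"` instead gives the Cygwin-versus-Interix factor on the same physical box, which takes the hardware and the Windows NFS client out of the comparison.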
Note that both test computers are sitting next to each other, on the
same unmanaged switch, talking to the same NFS share on the other side
of the country (so the hops to get there should be identical). Clearly
Windows' NFS client's performance is sub-par, but that is only to be
expected. The question is: what is Cygwin doing - that neither Linux nor
Interix does - that exacerbates the problem so badly? Do I need to be
going over strace output with a fine-toothed comb?
Clearly, the 6.5+ minute time is unacceptable*. For now I'm taking the
'copy everything locally' suggestion, but I would like to know why
Cygwin performs up to two orders of magnitude(!) worse than either Linux
or Interix. Also note that the 'stat' behavior seems to eliminate
[f]stat() as the culprit, leaving the suspicious read()s without an
obvious trigger.
(* "unacceptable" = "clearly this won't work, and I'll have to try
something different")
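One way to test the "stat is fine, the read()s are suspicious" reading without going through strace is to time the two operations separately on the same file. The following is only a rough sketch under that assumption: `best_of`, `read_in_chunks` and `probe` are hypothetical helper names, and the chunk sizes are arbitrary guesses, not what the shell or Cygwin actually uses.

```python
import os
import time

def best_of(fn, repeats=5):
    """Return the fastest wall-clock time of several runs, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

def read_in_chunks(path, chunk_size):
    """Read the whole file with unbuffered reads of chunk_size bytes.

    Returns the number of reads that returned data.
    """
    reads = 0
    with open(path, "rb", buffering=0) as handle:
        while handle.read(chunk_size):
            reads += 1
    return reads

def probe(path, chunk_sizes=(128, 4096, 65536)):
    """Time os.stat() and full-file reads at several chunk sizes (assumed values)."""
    results = {"stat": best_of(lambda: os.stat(path))}
    for size in chunk_sizes:
        results[f"read/{size}"] = best_of(lambda s=size: read_in_chunks(path, s))
    return results
```

Calling `probe()` on the script's path on the NFS share and printing the resulting dictionary on each platform would show whether the time grows with the number of read calls rather than with the file size. Taking the minimum of several runs follows the lower-bound approach used for the 'stat' column above, since the Cygwin numbers varied a lot.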
Before Dave Korn decides to flame me (again), I am not asking for an
immediate fix. I am posting my findings in case anything in them jumps
out at someone else. I fully intend to track this down *myself* when I
have the time to do so.
[snip] However, I would appreciate any existing
knowledge, or even pointers to where to start poking around, that anyone
would care to share.
I think that most of the pointers and insight about what Cygwin does with
fstat are all nicely encapsulated in the source code, specifically
fhandler_disk_file.cc and fhandler.cc.
That qualifies as a pointer; thanks, I appreciate it. :-)
--
Matthew
Feed the hippo. Love the hippo. Run from the hippo.