Hi Corinna,

Am 28.02.2025 um 13:46 schrieb Corinna Vinschen via Cygwin:
Hi Rainer,

On Feb 17 20:37, Rainer Emrich via Cygwin wrote:
Am 17.02.2025 um 18:00 schrieb Corinna Vinschen via Cygwin:
On Feb 17 12:51, Rainer Emrich via Cygwin wrote:
I'm facing a strange major issue with scp. The issue exists in all Cygwin
versions later than 3.5.3, including cygwin-3.6.0-0.374.g4dd859d01c22.

If I copy a large file with scp, I get a "lost connection" after a random
number of seconds:

scp -v large_file foobar:
.
.
debug1: Sending subsystem: sftp
debug1: pledge: fork
large_file                                             10%   71MB   4.3MB/s   02:21 ETA
debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
debug1: client_input_channel_req: channel 0 rtype e...@openssh.com reply 0
debug1: channel 0: free: client-session, nchannels 1
Transferred: sent 92266460, received 35436 bytes, in 15.3 seconds
Bytes per second: sent 6035219.0, received 2317.9
debug1: Exit status 11
lost connection

In fact, I can reproduce this occasionally back to 3.5.0 and back to
OpenSSH 9.7p1.  We can't easily try this with older Cygwin versions.
It's getting increasingly hard to build older Cygwin versions due to
compiler dependencies and missing symbols.
At least for my file size, around 700MB, I can't reproduce this with Cygwin
3.5.3. I first noticed this issue in the autumn of last year.

What that means, first of all, is that this is neither a regression
from 3.5.7, nor even from 3.5.1.  Obviously I can't prove whether this
was introduced in 3.5.0, but I'd like to point out that we haven't
had any noticeable change in the socket code for almost 4 years, back
during 3.3 development.

Fun fact: I can NOT reproduce the problem when using the -O option,
i. e., when using the old scp protocol.  The old protocol isn't
slower either.

Maybe that's a workaround for you?

I'll try this, thanks.

I've been debugging this problem on and off for the last couple of days,
and have even discussed it with one of the upstream OpenSSH maintainers.

But it's still a mystery to me.  The "lost connection" message does not
really point to the cause of the problem; it's just a follow-up effect:

The server receives an EPIPE on the read socket, which in turn
results in the client-side ssh receiving an "end-of-write" packet from
the server, which in turn results in ssh closing the pipe to scp, which
in turn prints the "lost connection" message.

The only thing I can say so far is that it appears to be signal related.

Fact is, scp usually runs a SIGALRM-triggered progress meter.  If
you disable the progress meter by running scp with the -q option, you
can avoid the "lost connection" as well; you don't have to run scp -O.
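
A minimal sketch of how the two workarounds mentioned so far (-q and -O)
could be wrapped in a script, assuming a POSIX shell.  The function name
scp_with_fallback and the SCP_BIN variable are hypothetical, introduced
only for illustration; -q and -O are the scp options discussed above.
This does not address the underlying cause, it only retries with the
variants that avoided the failure in the tests described here.

```shell
# Hypothetical helper: try the default scp first; if it fails, retry
# without the progress meter (-q), then with the legacy protocol (-O).
# SCP_BIN is an assumed override so the command can be replaced in tests.
scp_with_fallback() {
    src=$1
    dst=$2
    scp_bin=${SCP_BIN:-scp}
    for opts in "" "-q" "-O"; do
        # $opts is intentionally unquoted so an empty value adds no argument
        if "$scp_bin" $opts "$src" "$dst"; then
            return 0
        fi
        echo "scp $opts failed, trying next variant" >&2
    done
    return 1
}
```

It would be called like the original command, for example
scp_with_fallback large_file foobar: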

Thank you for the info, that's good to know.

The strange thing is, if I use strace to debug this, the copy succeeds:
strace -efno strace.log scp -v large_file foobar:

This often points to a timing issue, but beats me where that could be.
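
Since the failure is intermittent and seems to depend on timing, one way
to compare variants is to repeat the same copy several times and count
how often it fails.  The following is only a sketch under that
assumption; the function name count_scp_failures and the SCP_BIN
variable are hypothetical and not part of scp or Cygwin.

```shell
# Hypothetical harness: run the same copy several times and count failures.
# The first argument is the number of runs; all remaining arguments are
# passed to scp unchanged (for example -q or -O plus source and target).
# SCP_BIN is an assumed override so the command can be replaced in tests.
count_scp_failures() {
    runs=$1
    shift
    scp_bin=${SCP_BIN:-scp}
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Any non-zero exit status counts as one failed transfer
        "$scp_bin" "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, count_scp_failures 10 large_file foobar: could be compared
against count_scp_failures 10 -q large_file foobar: to see whether
disabling the progress meter changes the failure rate.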

I would try to debug this further, if I had an idea how to do that.

Same here ATM, sorry.

That's really strange.

Yeah, I know.  But it's really tricky.  All my debugging so far only
turned up followup effects, not the actual cause.  Sigh.

Sometimes it's really hard to find the cause of a failure. I wish you good luck. Anyway, thank you for all your work on Cygwin.

Rainer


-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
