On Tue, 09 Apr 2019 at 17:26:22 +0200, gregor herrmann wrote:
> On Tue, 09 Apr 2019 17:14:32 +0200, Guilhem Moulin wrote:
>> With TLS 1.3?  (You can pass ‘SSL_version => "TLSv1_3"’ to ssl_opts to
>> force this.)  Doesn't work here, still hangs on read():
>
> Yes, also with using TLSv1_3 explicitly:
> […]
> (trace attached in case it helps)

AFAICT this worked this time because the socket was *only* marked as
ready for writing after the first select() call.  Only during the second
call was there some data to be read:

> select(8, [3], [3], NULL, {tv_sec=180, tv_usec=0}) = 1 (out [3], left
> {tv_sec=179, tv_usec=999996})
> select(8, [3], NULL, NULL, {tv_sec=180, tv_usec=0}) = 1 (in [3], left
> {tv_sec=179, tv_usec=977469})

I'm unable to reproduce this with TLS 1.3, probably due to race
conditions.  Anyway I fail to see how the patch can help, because as I
wrote in

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914034#101

the socket is in blocking mode (hence SSL_MODE_AUTO_RETRY is set) by the
time LWP starts its select loop.  This is visible by adding fcntl(2) to
the traced set of system calls:

    $ strace -etrace=fcntl,select,read perl -MLWP::UserAgent -MIO::Socket::SSL -e \
        '$IO::Socket::SSL::DEBUG = 3; LWP::UserAgent->new(ssl_opts => {SSL_version => "TLSv1_3"})->post("https://facebook.com", { data => "" })'
    […]
    fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
    fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
    DEBUG: .../IO/Socket/SSL.pm:831: set socket to non-blocking to enforce timeout=180
    DEBUG: .../IO/Socket/SSL.pm:844: call Net::SSLeay::connect
    read(3, 0x5628bec16923, 5) = -1 EAGAIN (Resource temporarily unavailable)
    DEBUG: .../IO/Socket/SSL.pm:847: done Net::SSLeay::connect -> -1
    DEBUG: .../IO/Socket/SSL.pm:857: ssl handshake in progress
    DEBUG: .../IO/Socket/SSL.pm:867: waiting for fd to become ready: SSL wants a read first
    select(8, [3], NULL, NULL, {tv_sec=180, tv_usec=0}) = 1 (in [3], left {tv_sec=179, tv_usec=988296})
    DEBUG: .../IO/Socket/SSL.pm:887: socket ready, retrying connect
    DEBUG: .../IO/Socket/SSL.pm:844: call Net::SSLeay::connect
    […]
    DEBUG: .../IO/Socket/SSL.pm:847: done Net::SSLeay::connect -> 1
    DEBUG: .../IO/Socket/SSL.pm:902: ssl handshake done
    fcntl(3, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
    fcntl(3, F_SETFL, O_RDWR) = 0
    […]
    select(8, [3], [3], NULL, {tv_sec=180, tv_usec=0}) = 2 (in [3], out [3], left {tv_sec=179, tv_usec=999998})
    read(3, "…", 5) = 5
    read(3, "…", 156) = 156
    read(3,

When the non-application record comes in, the socket is marked as ready
for reading, but SSL_read() transparently processes the non-application
data record, and then blocks trying to read an application data record.

If one is lucky and the socket is *only* marked as ready for writing
(i.e., not for reading as well, like in your trace) when select() returns,
then the problem doesn't trigger (at least not right after the
handshake; OTOH it might occur later on renegotiation).  But AFAICT
that's orthogonal to whether the patch is applied or not: we use
blocking I/O, so SSL_MODE_AUTO_RETRY is set just like before
(`Net::SSLeay::set_mode($ssl, $mode_auto_retry)` is called just before
clearing O_NONBLOCK).

If the (blocking) socket is marked for reading when select() returns,
then the application assumes that SSL_read() won't block, and setting
SSL_MODE_AUTO_RETRY breaks that assumption, as written in the OpenSSL
changelog.  Instead of a blocking SSL_read(), the application expects it
to return SSL_ERROR_WANT_READ, and then proceeds with SSL_write() if the
socket is also ready for writing, like in the trace above.

-- 
Guilhem.
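
PS: the select()/SSL_read() interaction described above can be shown
with a toy model.  The sketch below is Python and uses neither OpenSSL
nor LWP.  The record framing and the names toy_ssl_read(), WANT_READ and
demo() are made up for illustration.  A short socket timeout stands in
for "the call would have blocked", so that the script terminates.

```python
import select
import socket
import struct

HANDSHAKE, APPLICATION_DATA = 22, 23  # TLS record content types
WANT_READ = object()  # hypothetical stand-in for SSL_ERROR_WANT_READ


def send_record(sock, content_type, payload):
    # Toy framing (an assumption, not real TLS): 1-byte type,
    # 2-byte big-endian length, then the payload.
    sock.sendall(struct.pack("!BH", content_type, len(payload)) + payload)


def recv_exact(sock, n):
    # Read exactly n bytes, or raise if the peer closes or the timeout hits.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def toy_ssl_read(sock, auto_retry):
    # Toy model of SSL_read(): non-application records are consumed
    # transparently.  With auto_retry the call keeps waiting for an
    # application data record; without it, the caller is told to retry.
    while True:
        content_type, length = struct.unpack("!BH", recv_exact(sock, 3))
        payload = recv_exact(sock, length)
        if content_type == APPLICATION_DATA:
            return payload
        if not auto_retry:
            return WANT_READ


def demo(auto_retry, timeout=0.2):
    client, server = socket.socketpair()
    try:
        client.settimeout(timeout)
        # The peer sends only a non-application record.
        send_record(server, HANDSHAKE, b"post-handshake message")
        readable, _, _ = select.select([client], [], [], timeout)
        assert readable  # select() reports the socket as ready for reading
        try:
            result = toy_ssl_read(client, auto_retry)
        except socket.timeout:
            return "blocked"  # a truly blocking socket would hang here
        return "want_read" if result is WANT_READ else result
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print("auto retry on: ", demo(auto_retry=True))
    print("auto retry off:", demo(auto_retry=False))
```

With auto retry on, the read stalls even though select() reported the
socket as readable.  With it off, the caller gets the retry indication
and is free to go on and write.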