On Tue, 09 Apr 2019 at 07:48:45 +0200, Steffen Ullrich wrote:
> Simply clearing SSL_MODE_AUTO_RETRY will cause problems with blocking
> connections in TLS 1.3.

AFAICT not when SSL_read() is used as documented. Also, while the issue is
triggered more often with TLS 1.3 than with earlier TLS protocol versions,
it's not specific to TLS 1.3:

    “TLSv1.3 sends more non-application data records after the handshake is
    finished. At least the session ticket and possibly a key update is send
    after the finished message. With TLSv1.2 it happened in case of
    renegotiation. SSL_read() has always documented that it can return
    SSL_ERROR_WANT_READ after processing non-application data, even when
    there is still data that can be read. When SSL_MODE_AUTO_RETRY is set
    using SSL_CTX_set_mode() OpenSSL will try to process the next record,
    and so not return SSL_ERROR_WANT_READ while it still has data available.
    Because many applications did not handle this properly,
    SSL_MODE_AUTO_RETRY has been made the default. If the application is
    using blocking sockets and SSL_MODE_AUTO_RETRY is enabled, and select()
    is used to check if a socket is readable this results in SSL_read()
    processing the non-application data records, but then try to read an
    application data record which might not be available and hang.”
    Source: https://wiki.openssl.org/index.php/TLS1.3#Non-application_data_records

FWIW OpenSSL 1.1.1a's changelog does mention that the new default causes
regressions:

    “SSL_MODE_AUTO_RETRY is enabled by default. Applications that use
    blocking I/O in combination with something like select() or poll() will
    hang. This can be turned off again using SSL_CTX_clear_mode(). Many
    applications do not properly handle non-application data records, and
    TLS 1.3 sends more of such records. Setting SSL_MODE_AUTO_RETRY works
    around the problems in those applications, but can also break some.
    It's recommended to read the manpages about SSL_read(), SSL_write(),
    SSL_get_error(), SSL_shutdown(), SSL_CTX_set_mode() and
    SSL_CTX_set_read_ahead() again.”
    Source: https://github.com/openssl/openssl/blob/OpenSSL_1_1_1a/CHANGES#L153

Programs that *were* broken (would have choked on renegotiation with TLS
<1.3, or on key update / session ticket with TLS 1.3) might work better now,
but it's *really* unfortunate that programs like LWP::Protocol::http, with a
properly written select(2) loop (i.e. able to handle
SSL_ERROR_WANT_{READ,WRITE}), are now broken.

> Please check if
> https://github.com/noxxi/p5-io-socket-ssl/commit/09bc6a3203bc7bc89078317da42a3e96cdbf94fc
> fixes the problems you see.

It doesn't, as the socket is in blocking mode when it enters the select
loop. As OpenSSL's changelog puts it, “Applications that use blocking I/O in
combination with something like select() or poll() will hang”.

I guess a better fix is not to change the OpenSSL default in
IO::Socket::SSL, but to make it configurable with a new option
‘SSL_auto_retry’, and to set that option to 0 in applications with select
loops. AFAICT the alternative would be to refactor all these loops, which is
clearly a much bigger task.

Also, this is not specific to IO::Socket::SSL. Any program with such
select/poll loops, written in any language, needs either refactoring or
SSL_MODE_AUTO_RETRY to be cleared.

-- 
Guilhem.
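
PS: To make the failure mode described in the wiki quote concrete, here is a
toy model in Python. It is only a sketch: it does not use OpenSSL or
IO::Socket::SSL at all, and the names ToyTLS, WouldBlock, select_loop and the
("ticket", None) / ("app", payload) record tuples are made up for
illustration. The "wire" is a queue of records, readable() stands in for
select(), and WouldBlock marks the spot where a real blocking read on the
socket would hang. In a real C program the mode would be cleared with
SSL_CTX_clear_mode(), as the changelog says.

```python
from collections import deque

WANT_READ = "SSL_ERROR_WANT_READ"  # stand-in for the OpenSSL error code


class WouldBlock(Exception):
    """Raised where a real blocking read on the socket would hang."""


class ToyTLS:
    """Toy model of a blocking TLS connection; NOT the OpenSSL API."""

    def __init__(self, records, auto_retry):
        # Each record is ("app", payload) or a non-application record
        # such as ("ticket", None).
        self.wire = deque(records)
        self.auto_retry = auto_retry

    def readable(self):
        # What select() would report: something is waiting on the socket.
        return bool(self.wire)

    def _next_record(self):
        if not self.wire:
            raise WouldBlock("blocking read with nothing on the wire")
        return self.wire.popleft()

    def read(self):
        kind, payload = self._next_record()
        while kind != "app":
            # A non-application record was consumed.
            if not self.auto_retry:
                return WANT_READ  # hand control back to the select loop
            # Auto-retry: immediately read the next record, which may hang.
            kind, payload = self._next_record()
        return payload


def select_loop(conn):
    """select()-style loop that copes with WANT_READ; returns payloads read."""
    out = []
    while conn.readable():
        result = conn.read()
        if result == WANT_READ:
            continue  # nothing for the application yet, go back to select()
        out.append(result)
    return out
```

With a lone session ticket on the wire, select_loop() returns an empty list
when auto_retry is off, and raises WouldBlock when it is on: the ticket made
the socket readable, but no application data follows it. A real loop would
go back to waiting in select() rather than exit when the wire is empty; the
model stops there only to stay short.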