Hi,

When working on getting rid of ImmediateInterruptOK I wanted to verify
that ssl still works correctly. Turned out it didn't. But neither did it
in master.

Turns out there are two major things we do wrong:

1) We ignore the rule that once called and returning
   SSL_ERROR_WANT_(READ|WRITE), SSL_read()/write() have to be called
   again with the same parameters. Unfortunately that rule doesn't just
   mean that the same parameters have to be passed in, but also that we
   can't just constantly switch between _read()/write(). Especially
   nonblocking backend code (i.e. walsender) and the whole frontend code
   violate this rule.

2) We start renegotiations in be_tls_write() while in nonblocking mode,
   but don't properly retry to handle socket readiness. There's a loop
   that retries handshakes twenty times (???), but what actually is
   needed is to call SSL_get_error() and ensure that there's actually
   data available.

2) is easy enough to fix, but 1) is pretty hard. Before anybody says that
1) isn't an important rule: It reliably causes connection aborts within a
couple of renegotiations. The higher the latency, the higher the
likelihood of aborts. Even locally it doesn't take very long to abort.
Errors usually are something like "SSL connection has been closed
unexpectedly" or "SSL Error: sslv3 alert unexpected message" and a host
of other similar messages. There are a couple of reports of those in the
archives and I've seen many more in client logs.

As far as I can see the only realistic way to fix 1) is to change both
frontend and backend code to:

a) Always check for socket read/writeability before calling
   SSL_read/write() when in nonblocking mode. That's a bit annoying
   because it nearly doubles the amount of syscalls we do for client
   communication, but I can't really see an alternative. That allows us
   to avoid waiting inside after a WANT_READ/WRITE, or having to set up a
   larger state machine that keeps track of what we tried last.

b) When SSL_read/write nonetheless returns WANT_READ/WRITE, even though
   we tested for read/writeability, we're very likely doing
   renegotiation. In that case we'll just have to block.
   There's already code that busy loops (and thus waits) in the frontend
   (cf. pgtls_read's WANT_WRITE case, triggered during reneg). We can't
   just return immediately to the upper layers as we'd otherwise likely
   violate the rule about calling ssl with the same parameters again.

c) Add a somewhat hacky optimization where we allow breaking out of a
   WANT_READ condition on a nonblocking socket when
   ssl->state == SSL_ST_OK. That's the case where it is actually, at
   least by my reading of the unreadable ssl code, safe to not wait. That
   case is somewhat important because we otherwise can end up waiting on
   both sides due to b), even when nonblocking calls were actually made.
   That condition essentially means that we'll only block if
   renegotiation or partial reads are in progress. Afaics at least.

d) Remove the SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER hack - we don't
   actually need it anymore.

These errors are much less frequent when using a plain frontend (e.g.
psql/pgbench) because they don't use copy both stuff - the way these
clients use the FE/BE protocol there are essentially natural
synchronization points where nothing but renegotiation happens. With
walsender (or pipelined queries!) both sides can write at the same time.

My testcase for this is just to set up a server with a low
ssl_renegotiation_limit, generate lots of WAL (wal.sql attached) and
receive data via pg_receivexlog -n. Usually it'll error out quickly.

I've done a preliminary implementation of the above steps and it survives
transferring 25GB of WAL via the replication protocol with
ssl_renegotiation_limit=100kB - previously it failed much earlier.

Does anybody have a neater way to tackle this? I'm not happy about this
solution, but I really can't think of anything better (save ditching
openssl maybe). I'm willing to clean up my hacked up fix for this, but
not if we can't find agreement on the approach.
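
For illustration only, the control flow of steps a) and b) on the read
side could look roughly like the sketch below. This is not the
preliminary implementation mentioned above. The tls_status codes, the
tls_read_fn callback type, wait_fd(), nonblocking_tls_read() and the
fake_tls stand-in are all made-up names; in real code the callback would
be SSL_read() followed by SSL_get_error(). Step c) is left out.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes standing in for SSL_get_error() results. */
enum tls_status { TLS_OK, TLS_WANT_READ, TLS_WANT_WRITE, TLS_FAIL };

/* Hypothetical callback type wrapping SSL_read() + SSL_get_error(). */
typedef enum tls_status (*tls_read_fn)(void *ctx, void *buf, size_t len,
                                       ssize_t *nread);

/* Poll one fd; timeout_ms = 0 only probes, -1 blocks. Retries on EINTR. */
static int
wait_fd(int fd, short events, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = events;
    pfd.revents = 0;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    return rc;                  /* > 0 ready, 0 not ready, < 0 error */
}

/*
 * Returns the number of bytes read, 0 if the socket was not readable
 * (the TLS layer was never entered, so the caller is free to do
 * something else), or -1 on error.
 */
static ssize_t
nonblocking_tls_read(int fd, tls_read_fn do_read, void *ctx,
                     void *buf, size_t len)
{
    ssize_t nread = 0;
    int rc;

    /* a) only enter the TLS layer if the socket is readable */
    rc = wait_fd(fd, POLLIN, 0);
    if (rc <= 0)
        return rc;

    for (;;)
    {
        /* every retry uses the same buffer and length */
        switch (do_read(ctx, buf, len, &nread))
        {
            case TLS_OK:
                return nread;
            case TLS_WANT_READ:     /* b) likely renegotiation: block */
                if (wait_fd(fd, POLLIN, -1) < 0)
                    return -1;
                break;
            case TLS_WANT_WRITE:
                if (wait_fd(fd, POLLOUT, -1) < 0)
                    return -1;
                break;
            default:
                return -1;
        }
    }
}

/* Stand-in for the TLS layer: reports WANT_READ once, then reads. */
struct fake_tls { int fd; int calls; };

static enum tls_status
fake_tls_read(void *ctx, void *buf, size_t len, ssize_t *nread)
{
    struct fake_tls *t = ctx;

    if (t->calls++ == 0)
        return TLS_WANT_READ;
    *nread = read(t->fd, buf, len);
    return (*nread > 0) ? TLS_OK : TLS_FAIL;
}
```

The point of the shape is that the caller never sees WANT_READ/WRITE:
either nothing was attempted, or the same call is retried to completion,
so no other SSL call can be interleaved in between.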

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers