This is an automated email from the ASF dual-hosted git repository.
luwei16 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new 0c313b391d2 [fix](brpc) disable SSL BIO buffer (#64962)
0c313b391d2 is described below
commit 0c313b391d20232b777bbe741039b69f824073a6
Author: Luwei <[email protected]>
AuthorDate: Fri Jul 3 12:14:45 2026 +0800
[fix](brpc) disable SSL BIO buffer (#64962)
Problem Summary:
Doris uses brpc 1.4.0 in thirdparty. When mTLS is enabled, brpc adds
an extra buffered BIO layer after SSL handshake. Large meta-service
get_rowset responses can expose a TLS write issue in this path:
SSL_write() may consume plaintext successfully, but the later BIO_flush()
can hit non-fatal EAGAIN while encrypted bytes are still buffered.
brpc 1.4.0 does not surface that flush EAGAIN as SSL_ERROR_WANT_WRITE to
the outer KeepWrite/EPOLLOUT retry path. The server side may therefore
treat the write as completed while BE receives an incomplete brpc
response frame and eventually times out.
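
To make the failure mode concrete, here is a minimal C++ toy model of a
buffering layer whose flush result is dropped. It is not brpc or OpenSSL
code: ToySocket, ToyBufferedLayer and write_ignoring_flush are hypothetical
names, and the "would block" condition only stands in for EAGAIN from
BIO_flush().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a non-blocking socket under backpressure.
struct ToySocket {
    std::size_t room;  // bytes the send queue can still accept
    std::string wire;  // bytes that actually reached the peer

    // Accepts at most `room` bytes; returning 0 models "would block".
    std::size_t send(const std::string& data) {
        std::size_t n = std::min(room, data.size());
        wire.append(data, 0, n);
        room -= n;
        return n;
    }
};

// Hypothetical stand-in for a buffering layer in front of the socket.
struct ToyBufferedLayer {
    ToySocket* sock;
    std::string pending;  // bytes accepted from the caller but not yet sent

    // Always accepts the whole message, like a write into a buffer.
    std::size_t write(const std::string& data) {
        pending += data;
        return data.size();
    }

    // Returns true only if every pending byte reached the socket.
    bool flush() {
        assert(sock != nullptr);
        std::size_t n = sock->send(pending);
        pending.erase(0, n);
        return pending.empty();
    }
};

// Buggy pattern: the flush result is ignored, so the caller is told the
// write completed even when bytes are still sitting in `pending`.
bool write_ignoring_flush(ToyBufferedLayer& layer, const std::string& msg) {
    layer.write(msg);
    layer.flush();  // "would block" is silently lost here
    return true;
}
```

With a socket that has room for 4 bytes and a 10-byte message,
write_ignoring_flush() returns true while 6 bytes remain in pending, which
mirrors the incomplete response frame described above.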
This is hard to reproduce locally because it depends on socket
backpressure during the buffered BIO flush, not just response size.
Production Service/VIP/CNI/node load/conntrack/send queues can make this
timing window easier to hit.
Backport the upstream brpc approach by removing the AddBIOBuffer(...) call
after SSL handshake. Without this buffered BIO layer, SSL_write() can surface
SSL_ERROR_WANT_WRITE directly to brpc's existing KeepWrite/EPOLLOUT retry
mechanism. Increasing the timeout is not a real fix because the server side
may already have misjudged the write as complete.
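
The retry behaviour this relies on can be sketched with a small C++ toy
model. It is an illustration only and does not use brpc or OpenSSL:
BackpressuredSocket, PendingWrite, try_write and write_until_done are
hypothetical names, kWantWrite stands in for SSL_ERROR_WANT_WRITE, and
drain() stands in for waiting on EPOLLOUT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

enum class WriteStatus { kDone, kWantWrite };

// Hypothetical stand-in for a non-blocking socket under backpressure.
struct BackpressuredSocket {
    std::size_t room;  // bytes the send queue can still accept
    std::string wire;  // bytes that actually reached the peer

    // Sends as much of data[offset..] as fits and returns the byte count.
    std::size_t send(const std::string& data, std::size_t offset) {
        std::size_t n = std::min(room, data.size() - offset);
        wire.append(data, offset, n);
        room -= n;
        return n;
    }

    // Models the peer reading data, which makes the socket writable again.
    void drain(std::size_t freed) { room += freed; }
};

// Tracks how much of one message has been written so far.
struct PendingWrite {
    std::string data;
    std::size_t sent = 0;
};

// One write attempt with no buffering layer: a short write is reported
// to the caller as kWantWrite instead of being hidden.
WriteStatus try_write(BackpressuredSocket& sock, PendingWrite& w) {
    w.sent += sock.send(w.data, w.sent);
    return w.sent == w.data.size() ? WriteStatus::kDone
                                   : WriteStatus::kWantWrite;
}

// Retry loop in the spirit of a KeepWrite-style path: keep writing until
// the message is fully sent. Returns the number of attempts made.
int write_until_done(BackpressuredSocket& sock, PendingWrite& w,
                     std::size_t freed_per_event) {
    assert(freed_per_event > 0);  // otherwise the loop could never finish
    int attempts = 1;
    while (try_write(sock, w) == WriteStatus::kWantWrite) {
        sock.drain(freed_per_event);  // stands in for an EPOLLOUT wakeup
        ++attempts;
    }
    return attempts;
}
```

Because the short write is visible to the caller, the remaining bytes are
retried rather than dropped; the message is only treated as complete once
every byte has reached the socket.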
### Release note
Fix possible meta-service get_rowset timeout with mTLS when brpc TLS
buffered BIO fails to flush all encrypted bytes under socket
backpressure.
### Check List (For Author)
- Test: No need to test. Thirdparty brpc patch only; the failure
depends on production socket backpressure timing.
- Behavior changed: Yes. TLS connections no longer use brpc's extra SSL
buffered BIO layer so WANT_WRITE is handled by the existing retry
mechanism.
- Does this need documentation: No
---
.../patches/brpc-1.4.0-disable-ssl-bio-buffer.patch | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/thirdparty/patches/brpc-1.4.0-disable-ssl-bio-buffer.patch b/thirdparty/patches/brpc-1.4.0-disable-ssl-bio-buffer.patch
new file mode 100644
index 00000000000..b8a3976853b
--- /dev/null
+++ b/thirdparty/patches/brpc-1.4.0-disable-ssl-bio-buffer.patch
@@ -0,0 +1,19 @@
+diff --git a/src/brpc/socket.cpp b/src/brpc/socket.cpp
+index e3878c19..d0dd4f4d 100644
+--- a/src/brpc/socket.cpp
++++ b/src/brpc/socket.cpp
+@@ -1889,7 +1889,13 @@ int Socket::SSLHandshake(int fd, bool server_mode) {
+ int rc = SSL_do_handshake(_ssl_session);
+ if (rc == 1) {
+ _ssl_state = SSL_CONNECTED;
+- AddBIOBuffer(_ssl_session, fd, FLAGS_ssl_bio_buffer_size);
++ // Do not add BIO_f_buffer on SSL connections.
++ //
++ // brpc-1.4.0 flushes this buffered BIO after SSL_write(), but a
++ // non-fatal EAGAIN from BIO_flush() is not propagated as
++ // SSL_ERROR_WANT_WRITE. Large TLS responses may then be considered
++ // written while encrypted bytes are still buffered and never retried.
++ // Let OpenSSL's socket BIO surface WANT_WRITE to KeepWrite instead.
+ return 0;
+ }
+