This is an automated email from the ASF dual-hosted git repository.

luwei16 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 0c313b391d2 [fix](brpc) disable SSL BIO buffer  (#64962)
0c313b391d2 is described below

commit 0c313b391d20232b777bbe741039b69f824073a6
Author: Luwei <[email protected]>
AuthorDate: Fri Jul 3 12:14:45 2026 +0800

    [fix](brpc) disable SSL BIO buffer  (#64962)
    
    Problem Summary:
    
    Doris uses brpc 1.4.0 in thirdparty. When mTLS is enabled, brpc adds
    an extra buffered BIO layer after SSL handshake. Large meta-service
    get_rowset responses can expose a TLS write issue in this path:
    SSL_write() may consume plaintext successfully, but the later
    BIO_flush() can hit non-fatal EAGAIN while encrypted bytes are still
    buffered.
    
    brpc 1.4.0 does not surface that flush EAGAIN as SSL_ERROR_WANT_WRITE to
    the outer KeepWrite/EPOLLOUT retry path. The server side may therefore
    treat the write as completed while BE receives an incomplete brpc
    response frame and eventually times out.
    
    This is hard to reproduce locally because it depends on socket
    backpressure during the buffered BIO flush, not just response size.
    Production Service/VIP/CNI/node load/conntrack/send queues can make this
    timing window easier to hit.
    
    Backport the upstream brpc approach by disabling AddBIOBuffer(...) after
    SSL handshake. Without this buffered BIO layer, SSL_write() can surface
    SSL_ERROR_WANT_WRITE directly to brpc's existing KeepWrite/EPOLLOUT
    retry mechanism. Increasing timeout is not a real fix because the server
    side may already have misjudged the write as complete.
    
    ### Release note
    
    Fix possible meta-service get_rowset timeout with mTLS when brpc TLS
    buffered BIO fails to flush all encrypted bytes under socket
    backpressure.
    
    ### Check List (For Author)
    
    - Test: No need to test. Thirdparty brpc patch only; the failure
      depends on production socket backpressure timing.
    - Behavior changed: Yes. TLS connections no longer use brpc's extra SSL
      buffered BIO layer so WANT_WRITE is handled by the existing retry
      mechanism.
    - Does this need documentation: No
---
 .../patches/brpc-1.4.0-disable-ssl-bio-buffer.patch   | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/thirdparty/patches/brpc-1.4.0-disable-ssl-bio-buffer.patch b/thirdparty/patches/brpc-1.4.0-disable-ssl-bio-buffer.patch
new file mode 100644
index 00000000000..b8a3976853b
--- /dev/null
+++ b/thirdparty/patches/brpc-1.4.0-disable-ssl-bio-buffer.patch
@@ -0,0 +1,19 @@
+diff --git a/src/brpc/socket.cpp b/src/brpc/socket.cpp
+index e3878c19..d0dd4f4d 100644
+--- a/src/brpc/socket.cpp
++++ b/src/brpc/socket.cpp
+@@ -1889,7 +1889,13 @@ int Socket::SSLHandshake(int fd, bool server_mode) {
+         int rc = SSL_do_handshake(_ssl_session);
+         if (rc == 1) {
+             _ssl_state = SSL_CONNECTED;
+-            AddBIOBuffer(_ssl_session, fd, FLAGS_ssl_bio_buffer_size);
++            // Do not add BIO_f_buffer on SSL connections.
++            //
++            // brpc-1.4.0 flushes this buffered BIO after SSL_write(), but a
++            // non-fatal EAGAIN from BIO_flush() is not propagated as
+            // SSL_ERROR_WANT_WRITE. Large TLS responses may then be considered
+            // written while encrypted bytes are still buffered and never retried.
+            // Let OpenSSL's socket BIO surface WANT_WRITE to KeepWrite instead.
+             return 0;
+         }
+ 
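
The failure mode described in the commit message can be illustrated with a toy model. The sketch below does not call OpenSSL or brpc; ToySocket, BufferedWriter and DirectWriter are hypothetical names introduced only for illustration, and the "swallowed flush result" is an assumption that mirrors the description above rather than the actual brpc 1.4.0 code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

enum class WriteResult { kDone, kWantWrite };

// Toy non-blocking socket: accepts at most `capacity` bytes in total until
// Drain() makes more room. Hypothetical type, not a brpc or OpenSSL API.
struct ToySocket {
    std::size_t capacity;
    std::string wire;  // bytes that actually reached the peer

    explicit ToySocket(std::size_t cap) : capacity(cap) {}

    // Returns the number of bytes accepted, or -1 for an EAGAIN-like result.
    long Write(const std::string& data) {
        std::size_t room = capacity > wire.size() ? capacity - wire.size() : 0;
        if (room == 0) return -1;
        std::size_t n = std::min(room, data.size());
        wire.append(data, 0, n);
        return static_cast<long>(n);
    }

    // Simulates the peer reading data, which frees send-queue space.
    void Drain(std::size_t n) { capacity += n; }
};

// Models the buffered path: the payload is fully "consumed" into a buffer,
// and an incomplete flush is not reported to the caller (assumed behaviour).
struct BufferedWriter {
    ToySocket* sock;
    std::string pending;  // bytes accepted from the caller but not yet sent

    explicit BufferedWriter(ToySocket* s) : sock(s) {}

    WriteResult Write(const std::string& data) {
        pending += data;
        long n = sock->Write(pending);
        if (n > 0) pending.erase(0, static_cast<std::size_t>(n));
        return WriteResult::kDone;  // caller never learns about `pending`
    }
};

// Models the unbuffered path: a short write is reported as "want write",
// so the caller retries once the socket is writable again.
struct DirectWriter {
    ToySocket* sock;
    std::size_t sent;  // bytes of the current payload already on the wire

    explicit DirectWriter(ToySocket* s) : sock(s), sent(0) {}

    WriteResult Write(const std::string& data) {
        long n = sock->Write(data.substr(sent));
        if (n > 0) sent += static_cast<std::size_t>(n);
        return sent == data.size() ? WriteResult::kDone
                                   : WriteResult::kWantWrite;
    }
};
```

In this model the buffered writer reports completion while part of the payload is still held in `pending`, whereas the direct writer returns kWantWrite and finishes once the caller retries after Drain(). That is the same shape as letting WANT_WRITE reach the KeepWrite/EPOLLOUT retry path.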


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
