XLPE opened a new issue, #51941:
URL: https://github.com/apache/doris/issues/51941

   ### Search before asking
   
   - [x] I have searched the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   2.1.5+
   
   ### What's Wrong?
   
   Every time an FE node restarts, the following exception keeps appearing.
   
   ```
   2025-06-19 10:59:51,722 WARN (Manual Analysis Job Executor-1|12257) [StatisticsCache.sendStats():241] Failed to sync stats to follower: TNetworkAddress(hostname:192.168.71.14, port:9520)
   org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:216) ~[libthrift-0.16.0.jar:0.16.0]
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) ~[libthrift-0.16.0.jar:0.16.0]
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) ~[libthrift-0.16.0.jar:0.16.0]
        at org.apache.doris.thrift.FrontendService$Client.sendUpdateStatsCache(FrontendService.java:1370) ~[fe-common-1.2-SNAPSHOT.jar:1.2-SNAPSHOT]
        at org.apache.doris.thrift.FrontendService$Client.updateStatsCache(FrontendService.java:1362) ~[fe-common-1.2-SNAPSHOT.jar:1.2-SNAPSHOT]
        at org.apache.doris.statistics.StatisticsCache.sendStats(StatisticsCache.java:239) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.statistics.StatisticsCache.syncColStats(StatisticsCache.java:229) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.statistics.BaseAnalysisTask.runQuery(BaseAnalysisTask.java:309) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.statistics.OlapAnalysisTask.doSample(OlapAnalysisTask.java:136) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.statistics.OlapAnalysisTask.doExecute(OlapAnalysisTask.java:96) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.statistics.BaseAnalysisTask.execute(BaseAnalysisTask.java:175) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.statistics.AnalysisTaskWrapper.lambda$new$0(AnalysisTaskWrapper.java:43) ~[doris-fe.jar:1.2-SNAPSHOT]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at org.apache.doris.statistics.AnalysisTaskWrapper.run(AnalysisTaskWrapper.java:66) ~[doris-fe.jar:1.2-SNAPSHOT]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at java.lang.Thread.run(Thread.java:833) ~[?:?]
   Caused by: java.net.SocketException: Broken pipe
        at sun.nio.ch.NioSocketImpl.implWrite(NioSocketImpl.java:420) ~[?:?]
        at sun.nio.ch.NioSocketImpl.write(NioSocketImpl.java:440) ~[?:?]
        at sun.nio.ch.NioSocketImpl$2.write(NioSocketImpl.java:826) ~[?:?]
        at java.net.Socket$SocketOutputStream.write(Socket.java:1035) ~[?:?]
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81) ~[?:?]
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142) ~[?:?]
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:211) ~[libthrift-0.16.0.jar:0.16.0]
        ... 18 more
   ```
   The root cause is that the client does not catch the I/O exception, so a connection that is already closed gets returned to the pool. In addition, the pool's validation method calls the transport's `isOpen()`, which relies on `isConnected()`; that check cannot detect a closed or broken connection, so the stale connection passes validation and is reused.
   ```
   @Override
   public boolean validateObject(TNetworkAddress key, PooledObject<VALUE> p) {
       boolean isOpen = p.getObject().getOutputProtocol().getTransport().isOpen();
       if (LOG.isDebugEnabled()) {
           LOG.debug("isOpen={}", isOpen);
       }
       return isOpen;
   }
   ```
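
Why an `isConnected()`-based check lets a dead connection through can be shown with the JDK alone. The following is a minimal sketch, not Doris or Thrift code: the class name `StaleSocketDemo` and the method `stateAfterClose` are made up for illustration, and it assumes a loopback socket can be opened on the machine running it. It relies only on the documented behaviour of `java.net.Socket`, where closing a socket does not clear its connection state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Doris or Thrift.
class StaleSocketDemo {

    /** Returns {isConnected, isClosed} as reported by a client socket after it was closed. */
    static boolean[] stateAfterClose() {
        // Port 0 lets the OS pick a free port on the loopback interface.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            try (Socket accepted = server.accept()) {
                client.close();
                // isConnected() only records that a connect once succeeded;
                // isClosed() is the flag that reflects the close.
                return new boolean[] {client.isConnected(), client.isClosed()};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] state = stateAfterClose();
        System.out.println("isConnected=" + state[0] + ", isClosed=" + state[1]);
    }
}
```

Both flags come back `true`, so a validation that looks only at `isConnected()` treats a closed socket as healthy. A connection that the remote FE dropped during its restart is harder still to detect, because neither flag changes until a write actually fails.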
   
   
   
   
   
   
   ### What You Expected?
   
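   No exception. A connection broken by the FE restart should be discarded and replaced instead of being reused from the pool.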
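
One way to get that behaviour is to invalidate the client when a send fails rather than handing it back. The sketch below only illustrates the pattern and is not the Doris implementation: `Client`, `SimplePool`, `sendWithRetry`, and `poolWithStaleClient` are hypothetical names, and the tiny pool stands in for the real keyed pool. With Apache Commons Pool 2 the corresponding call would be `invalidateObject(key, obj)` on the keyed pool in place of `returnObject`.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical demo class; all names here are illustrative assumptions.
class InvalidateOnFailureDemo {

    /** Stand-in for a pooled Thrift client. */
    interface Client {
        void send(String payload) throws IOException;
    }

    /** Minimal pool: idle clients are reused, invalidated ones never come back. */
    static final class SimplePool {
        private final Deque<Client> idle = new ArrayDeque<>();
        private final Supplier<Client> factory;

        SimplePool(Supplier<Client> factory) {
            this.factory = factory;
        }

        Client borrow() {
            Client c = idle.poll();
            return c != null ? c : factory.get();
        }

        void giveBack(Client c) {
            idle.push(c);
        }

        void invalidate(Client c) {
            // Intentionally not re-added to idle; a real pool would also close it.
        }

        int idleCount() {
            return idle.size();
        }
    }

    /** Sends the payload, dropping any client whose send fails and retrying with another. */
    static boolean sendWithRetry(SimplePool pool, String payload, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Client client = pool.borrow();
            try {
                client.send(payload);
                pool.giveBack(client);   // only a client that worked goes back
                return true;
            } catch (IOException e) {
                pool.invalidate(client); // the broken client is never returned
            }
        }
        return false;
    }

    /** Builds a pool holding one stale client, as after a follower FE restart. */
    static SimplePool poolWithStaleClient() {
        SimplePool pool = new SimplePool(() -> payload -> { /* healthy: accepts anything */ });
        pool.giveBack(payload -> { throw new IOException("Broken pipe"); });
        return pool;
    }

    public static void main(String[] args) {
        SimplePool pool = poolWithStaleClient();
        System.out.println("sent=" + sendWithRetry(pool, "stats", 2)
                + ", idle=" + pool.idleCount());
    }
}
```

In this sketch the first attempt hits the stale client and fails, the stale client is dropped, and the second attempt succeeds on a fresh one, leaving exactly one healthy client idle in the pool.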
   
   ### How to Reproduce?
   
   Restart the non-Master FE nodes, and then manually execute the `ANALYZE TABLE` command.
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

