Hi dear,
I encountered a weird problem: http requests got timeout when the QPS is high,
even if my server has processed the requests and responsed immediately.
(It might be a little long, your patience is appreciated)
1. Below is taken from the log file, wheras
ApacheAsyncHttpRequestExecutor is an encapsulation of HttpAsyncRequestExecutor
and adds more logs, as show below:
2019-05-29 14:33:30,933 INFO Apache-NHttp-1-23 ApacheAsyncHttpRequestExecutor
- requestReady: http-outgoing-43798 [ACTIVE], /api/v1/newOrder,
6445b9465cee27ba81fb02280d61cd99
2019-05-29 14:33:30,934 INFO Apache-NHttp-1-23 ApacheAsyncHttpRequestExecutor
- conn submitRequest: http-outgoing-43798 [ACTIVE], /api/v1/newOrder,
6445b9465cee27ba81fb02280d61cd99
2019-05-29 14:33:30,934 INFO Apache-NHttp-1-23 ApacheAsyncHttpRequestExecutor
- requestReady: http-outgoing-43798 [ACTIVE], /api/v1/newOrder,
6445b9465cee27ba81fb02280d61cd99
2019-05-29 14:33:31,178 INFO Apache-NHttp-1-23 ApacheAsyncHttpRequestExecutor
- responseReceived: http-outgoing-43798 [ACTIVE(1850)], /api/v1/newOrder,
6445b9465cee27ba81fb02280d61cd99, 200, 244
2019-05-29 14:33:36,224 ERROR Apache-NHttp-1-23 HttpMgr - Receive error (发单,
408, 6445b9465cee27ba81fb02280d61cd99): 5,000 milliseconds timeout on
connection http-outgoing-43798 [ACTIVE]
// comments: The above several lines belong to the same http request as they
have the same trace id 6445b9465cee27ba81fb02280d61cd99, trace id is per
request and is unique within all backend servers
// comments: Response already received at 14:33:31,178, but the request failed
at 14:33:36,224 with an error status 408 and error msg “5,000 ms timeout on
connection…”
// comments: Beside the logs above, I’m quite sure that my backend server
responsed quickly(~100ms, so it’s impossible to timeout for 5000ms) as I
checked server side logs with the unique trace id.
// comments: This only happens when the http client runs at a high QPS like
20000 on my side, then more and more 408 errors happen. At this time my
physical machine runs at a low CPU/memory usage.
2. So I turned on debug level logs of apache asynchttpclient, and below
are some more logs that could be helpful:
2019-05-29 14:33:30,930 DEBUG httpmgr-4-422 InternalHttpAsyncClient -
[exchange: 679621] Connection allocated: CPoolProxy{http-outgoing-43798
[ACTIVE]}
2019-05-29 14:33:30,930 DEBUG httpmgr-4-422 ManagedNHttpClientConnectionImpl -
http-outgoing-43798 100.2.2.2:25905<->100.3.3.3:80[ACTIVE][r:r]: Set attribute
http.nio.exchange-handler
2019-05-29 14:33:30,930 DEBUG httpmgr-4-422 ManagedNHttpClientConnectionImpl -
http-outgoing-43798 100.2.2.2:25905<->100.3.3.3:80[ACTIVE][rw:r]: Event set [w]
2019-05-29 14:33:30,933 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE] Request ready
2019-05-29 14:33:30,933 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][rw:w]: Set timeout 5000
2019-05-29 14:33:30,934 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][rw:w]: Event set [w]
2019-05-29 14:33:30,934 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE] Output ready
2019-05-29 14:33:30,934 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE] [content length: 1813; pos: 1813; completed: true]
2019-05-29 14:33:30,934 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][rw:w]: 2287 bytes written
2019-05-29 14:33:30,934 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE] Request ready
2019-05-29 14:33:30,934 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][r:w]: Event cleared [w]
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23 AbstractIODispatch -
http-outgoing-43798 [ACTIVE] Input ready
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][r:r]: 2333 bytes read
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE(1850)] Response received
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE(1850)] Input ready
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][r:r]: 0 bytes read
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][r:r]: 0 bytes read
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][r:r]: 0 bytes read
2019-05-29 14:33:31,178 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE] [chunk-coded; completed: false]
2019-05-29 14:33:36,223 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [ACTIVE] Timeout
2019-05-29 14:33:36,224 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
100.2.2.2:25905<->100.3.3.3:80[ACTIVE][r:r]: Shutdown
2019-05-29 14:33:36,224 DEBUG Apache-NHttp-1-23
PoolingNHttpClientConnectionManager - Releasing connection: [id:
http-outgoing-43798][route: {}->http://api.udache.com:80][total kept alive: 16;
route allocated: 304 of 20000; total allocated: 328 of 25000]
2019-05-29 14:33:36,224 DEBUG Apache-NHttp-1-23
PoolingNHttpClientConnectionManager - Connection released: [id:
http-outgoing-43798][route: {}->http://api.udache.com:80][total kept alive: 14;
route allocated: 303 of 20000; total allocated: 327 of 25000]
2019-05-29 14:33:36,225 DEBUG Apache-NHttp-1-23
ManagedNHttpClientConnectionImpl - http-outgoing-43798
0.0.0.0:25905<->100.3.3.3:80[CLOSED][]: Shutdown
2019-05-29 14:33:36,225 DEBUG Apache-NHttp-1-23 InternalIODispatch -
http-outgoing-43798 [CLOSED]: Disconnected
3. Environment:
Machine that runs http client is within the same IDC as machine that runs server
CentOS release 6.7 (Final), Linux version 2.6.32-573.18.1.el6.toav2.x86_64
Java 1.8.0_101;
Apache asynchtppclient 4.1.4;
Theoretically http client should receive reponse correctly without timeout
errors, it’s difficult to explain this issue. I’ve investigated source code of
asynchttpclient for nearly 1 day, but got no clues so far. Could anyone help me
on this?
Hope I make the question clear, and any reply is high appreciated.
Regards
xiaoqing