Hi
On 16/04/2018 12:04, Igor Cicimov wrote:
>
>
> On Mon, 16 Apr 2018 6:09 pm Ayush Goyal <[email protected]> wrote:
>
>     Hi Moemen,
>
>     Thanks for your response. But I think I need to clarify a few
>     things here.
>
>     On Mon, Apr 16, 2018 at 4:33 AM Moemen MHEDHBI
>     <[email protected]> wrote:
>
>         Hi
>
>
>         On 12/04/2018 19:16, Ayush Goyal wrote:
>> Hi,
>>
>> I have a question regarding haproxy backend connection
>> behaviour. We have the following setup:
>>
>>   +---------+     +-------+
>>   | haproxy |---->| nginx |
>>   +---------+     +-------+
>>
>> We use a haproxy cluster for ssl off-loading and then load-balance
>> requests to the nginx cluster. We are currently benchmarking this
>> setup with 3 nodes for the haproxy cluster and 1 nginx node. Each
>> haproxy node has two frontend/backend pairs. The first frontend is
>> a router for ssl connections which redistributes requests to the
>> second frontend in the haproxy cluster. The second frontend is for
>> the ssl handshake and for routing requests to the nginx servers.
>> Our configuration is as follows:
>>
>> ```
>> global
>>     maxconn 100000
>>     user haproxy
>>     group haproxy
>>     nbproc 2
>>     cpu-map 1 1
>>     cpu-map 2 2
>>
>> defaults
>>     mode http
>>     option forwardfor
>>     timeout connect 5s
>>     timeout client 30s
>>     timeout server 30s
>>     timeout tunnel 30m
>>     timeout client-fin 5s
>>
>> frontend ssl_sess_id_router
>>     bind *:443
>>     bind-process 1
>>     mode tcp
>>     maxconn 100000
>>     log global
>>     option tcp-smart-accept
>>     option splice-request
>>     option splice-response
>>     default_backend ssl_sess_id_router_backend
>>
>> backend ssl_sess_id_router_backend
>>     bind-process 1
>>     mode tcp
>>     fullconn 50000
>>     balance roundrobin
>>     ...<ssl_stickiness_config>...
>>     option tcp-smart-connect
>>     server lbtest01 <ip1>:8443 weight 1 check send-proxy
>>     server lbtest02 <ip2>:8443 weight 1 check send-proxy
>>     server lbtest03 <ip3>:8443 weight 1 check send-proxy
>>
>> frontend nginx_ssl_fe
>>     bind *:8443 ssl <ssl_options>
>>     maxconn 100000
>>     bind-process 2
>>     option tcp-smart-accept
>>     option splice-request
>>     option splice-response
>>     option forwardfor
>>     reqadd X-Forwarded-Proto:\ https
>>     timeout client-fin 5s
>>     timeout http-request 8s
>>     timeout http-keep-alive 30s
>>     default_backend nginx_backend
>>
>> backend nginx_backend
>>     bind-process 2
>>     balance roundrobin
>>     http-reuse safe
>>     option tcp-smart-connect
>>     option splice-request
>>     option splice-response
>>     timeout tunnel 30m
>>     timeout http-request 8s
>>     timeout http-keep-alive 30s
>>     server testnginx <ip>:80 weight 1 check
>> ```
>>
>> The nginx node runs nginx with 4 workers and 8192 max clients,
>> therefore the max number of connections it can accept is 32768.
>>
>> For the benchmark, we are generating ~3k new connections per
>> second, where each connection makes 1 http request and then holds
>> the connection for the next 30 seconds. This results in a high
>> number of established connections on the first frontend,
>> ssl_sess_id_router: ~25k per haproxy node (total ~77k connections
>> on 3 haproxy nodes). The second frontend (nginx_ssl_fe) receives
>> the same number of connections. On the nginx node, we see that
>> active connections increase to ~32k.
>>
>> Our understanding is that haproxy should keep a 1:1 connection
>> mapping for each new connection in frontend/backend. But there is
>> a connection count mismatch between haproxy and nginx (total 77k
>> connections in all 3 haproxy nodes for both frontends vs 32k
>> connections in nginx made by nginx_backend). We are still not
>> facing any major 5xx or connection errors. We are assuming that
>> this is happening because haproxy is terminating old idle ssl
>> connections to serve the new ones. We have the following
>> questions:
>>
>> 1. How are the nginx_backend connections being terminated to
>> serve the new connections?
>         Connections are usually terminated when the client receives
>         the whole response. Closing the connection can be initiated
>         by the client, the server, or HAProxy (timeouts, etc.).
>
>
>     Client connections are keep-alive here for 30 seconds from the
>     client side. Various timeout values in both nginx and haproxy are
>     sufficiently high, of the order of 60 seconds. Still, what we are
>     observing here is that nginx is closing the connection after 7-14
>     seconds to serve new client requests. Not sure why nginx or
>     haproxy would close existing keep-alive connections to serve new
>     requests when timeouts are sufficiently high?
>

A keep-alive connection may be closed by the client or the server with
the "Connection: close" header. Or the connection may be closed because
of timeouts. A traffic capture will show what the cause is here.

>> 2. Why is haproxy not terminating connections on the frontend
>> to keep them at 32k for 1:1 mapping?
>         I think there is no 1:1 mapping between the number of
>         connections in haproxy and nginx. This is because you are
>         chaining the two frontend/backend pairs in haproxy, so when
>         the client establishes 1 connection with haproxy you will see
>         2 established connections in the haproxy stats. This explains
>         why the number of connections in haproxy is double the number
>         in nginx.
>
>
>     I want to clarify the number of connections here in each
>     frontend. We are observing 77k connections in the first frontend
>     stats, i.e. ssl_sess_id_router, initiated by the client. Then we
>     are observing another set of 77k connections in the nginx_ssl_fe
>     frontend stats, initiated by the backend of the first frontend.
>     But the corresponding connections in the backend for the second
>     frontend are much fewer, around 32k. This is not the same as your
>     explanation. The connections in the nginx_ssl_fe frontend stats
>     are more than double what nginx can handle. The question is: when
>     nginx can only do 32k connections, how can the nginx_ssl_fe
>     frontend accept 77k connections?
>
>
> It seems you are forgetting that haproxy queues frontend connections
> instead of dropping them. Maybe samples of your log can be useful to
> see the request timings.
>
>
>     --
>     Ayush Goyal
>

As suggested by Igor, samples of your log will give us a better idea of
the number of active and queued connections here.
Also, if you can send a screenshot of the stats page or the output of
the 'show stat' command in the CLI (for both sockets, I suppose that
you use a stat socket per process), it will be easier to diagnose this.

-- 
Moemen MHEDHBI
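
For comparing active and queued connections per proxy, here is a minimal Python sketch that reads the CSV text returned by 'show stat' and pulls out the scur (current sessions) and qcur (current queued requests) columns. It assumes the CSV output has already been captured from each stats socket. The function name summarize_show_stat and the sample rows are made up for illustration; real output has many more columns and different numbers.

```python
import csv
import io


def summarize_show_stat(csv_text):
    """Map (pxname, svname) to its current sessions and queued requests."""
    # 'show stat' prefixes its header line with "# "; strip it so that
    # csv.DictReader sees plain column names.
    cleaned = csv_text.lstrip()
    if cleaned.startswith("# "):
        cleaned = cleaned[2:]
    summary = {}
    for row in csv.DictReader(io.StringIO(cleaned)):
        if not row.get("pxname"):
            continue  # ignore rows without a proxy name
        summary[(row["pxname"], row["svname"])] = {
            # Empty fields (e.g. qcur on FRONTEND rows) are counted as 0.
            "scur": int(row["scur"] or 0),
            "qcur": int(row["qcur"] or 0),
        }
    return summary


# Hypothetical sample, trimmed to a few columns; the values are invented.
SAMPLE = """# pxname,svname,qcur,qmax,scur,smax
nginx_ssl_fe,FRONTEND,,,25000,26000
nginx_backend,testnginx,120,300,10900,11000
nginx_backend,BACKEND,0,0,10900,11000
"""

for (pxname, svname), counters in summarize_show_stat(SAMPLE).items():
    print(pxname, svname, counters)
```

Running this against the output of each process's socket and comparing scur on nginx_ssl_fe with scur and qcur on nginx_backend should show whether the difference is sitting in the queue.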

