[I've been using HAProxy for 5 years, but with a static config that gets reloaded.]

Due to issues with the speed of backend discovery through DNS on AWS, I'm writing my own system to insert servers into my load balancers on the fly.

As server names in the backend, I'm using a task ID from my cloud provider:

backend pages

        timeout server 120s
        option forwardfor
        http-request redirect scheme https if ! { ssl_fc }
        option httpchk GET /health.php
        default-server inter 5s fall 3 rise 2
        balance random
        server bdb47d1ac9644c5f99c5e90dd4f9b944 172.31.35.239:80 weight 10 maxconn 16 check slowstart 10s

With a config built from the cluster status, everything is fine (see the request at 16:33:58 in the logs below).

When AWS/ECS sends me a new task, I register it with these 3 commands (at 16:34:27):

echo "add server pages/bdb47d1ac9644c5f99c5e90dd4f9b944 172.31.35.239:80 weight 10 maxconn 32 check inter 5s fall 3 rise 2 slowstart 10s " |netcat -w 2 172.31.33.146 9999
New server registered.

echo "enable health pages/bdb47d1ac9644c5f99c5e90dd4f9b944" |netcat -w 2 172.31.33.146 9999

echo "enable server pages/bdb47d1ac9644c5f99c5e90dd4f9b944" |netcat -w 2 172.31.33.146 9999
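For reference, the three commands above can be wrapped in a small helper. This is only a sketch: the function names (`hap_cmd`, `register_server`) are mine, and the runtime API address is taken from the commands above.

```shell
#!/bin/sh
# Sketch: register a dynamic server over the HAProxy runtime API.
# HAPROXY_CLI is the stats socket address used in the original commands.
HAPROXY_CLI="172.31.33.146 9999"

hap_cmd() {
    # Send one command to the runtime API and print the reply.
    echo "$1" | netcat -w 2 $HAPROXY_CLI
}

register_server() {
    # $1 = backend, $2 = server name (task ID), $3 = addr:port
    hap_cmd "add server $1/$2 $3 weight 10 maxconn 32 check inter 5s fall 3 rise 2 slowstart 10s"
    hap_cmd "enable health $1/$2"
    hap_cmd "enable server $1/$2"
}
```

Usage would be e.g. `register_server pages bdb47d1ac9644c5f99c5e90dd4f9b944 172.31.35.239:80`.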


As you can see in the logs, the server is seen, registered and marked as UP. But for a request made a few seconds later, the backend can't find a suitable server to fulfil it (the 503 with pages/<NOSRV> at 16:35:16).


Feb  7 16:33:29 ip-172-31-33-146 haproxy[42439]: [NOTICE]   (42439) : haproxy version is 2.7.2-1ppa1~jammy
Feb  7 16:33:29 ip-172-31-33-146 haproxy[42439]: [NOTICE] (42439) : path to executable is /usr/sbin/haproxy
Feb  7 16:33:29 ip-172-31-33-146 haproxy[42439]: [NOTICE] (42439) : New worker (42442) forked
Feb  7 16:33:29 ip-172-31-33-146 haproxy[42439]: [NOTICE] (42439) : Loading success.
Feb  7 16:33:58 ip-172-31-33-146 haproxy[42442]: 82.66.114.242:57352 [07/Feb/2023:16:33:57.712] www~ pages/bdb47d1ac9644c5f99c5e90dd4f9b944 0/0/0/1131/1141 200 67569 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1" www.XXXXXXXX Wget/1.20.3 (linux-gnu)
Feb  7 16:34:15 ip-172-31-33-146 haproxy[42442]: [WARNING] (42442) : Server pages/bdb47d1ac9644c5f99c5e90dd4f9b944 is going DOWN for maintenance. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Feb  7 16:34:15 ip-172-31-33-146 haproxy[42442]: [ALERT] (42442) : backend 'pages' has no server available!
Feb  7 16:34:15 ip-172-31-33-146 haproxy[42442]: Server pages/bdb47d1ac9644c5f99c5e90dd4f9b944 is going DOWN for maintenance. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Feb  7 16:34:15 ip-172-31-33-146 haproxy[42442]: backend pages has no server available!
Feb  7 16:34:20 ip-172-31-33-146 haproxy[42442]: [NOTICE] (42442) : Server deleted.
Feb  7 16:34:27 ip-172-31-33-146 haproxy[42442]: [NOTICE] (42442) : CLI : 'server pages/bdb47d1ac9644c5f99c5e90dd4f9b944' : New server registered.
Feb  7 16:34:40 ip-172-31-33-146 haproxy[42442]: [WARNING] (42442) : Server pages/bdb47d1ac9644c5f99c5e90dd4f9b944 is UP/READY (leaving forced maintenance).
Feb  7 16:34:40 ip-172-31-33-146 haproxy[42442]: Server pages/bdb47d1ac9644c5f99c5e90dd4f9b944 is UP/READY (leaving forced maintenance).
Feb  7 16:34:50 ip-172-31-33-146 haproxy[42442]: [WARNING] (42442) : Server pages/bdb47d1ac9644c5f99c5e90dd4f9b944 is UP. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Feb  7 16:34:50 ip-172-31-33-146 haproxy[42442]: Server pages/bdb47d1ac9644c5f99c5e90dd4f9b944 is UP. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
Feb  7 16:35:16 ip-172-31-33-146 haproxy[42442]: 82.66.114.242:36698 [07/Feb/2023:16:35:01.250] www~ pages/<NOSRV> 0/15001/-1/-1/15001 503 4793 - - sQ-- 1/1/0/0/0 0/1 "GET / HTTP/1.1" www.XXXXXXX Wget/1.20.3 (linux-gnu)


The server state looks like this:

echo "show servers state pages" |netcat -w 2 172.31.33.146 9999
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port srv_check_addr srv_agent_addr srv_agent_port
5 pages 1 bdb47d1ac9644c5f99c5e90dd4f9b944 172.31.35.239 2 0 10 10 2087 15 3 4 6 0 0 0 - 80 - 0 0 - - 0

srv_check_result is 3, which indicates the health checks are fine.
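To make that dump easier to read, the interesting columns can be pulled out with awk. A sketch: the function name `show_srv_state` is mine, the field positions follow the header line of the version-1 dump above, and the runtime API address is the one from the earlier commands.

```shell
#!/bin/sh
# Sketch: decode the interesting columns of "show servers state <backend>".
show_srv_state() {
    # $1 = backend name
    echo "show servers state $1" | netcat -w 2 172.31.33.146 9999 |
    awk 'NR > 2 {
        # Per the header line of the dump:
        # $4=srv_name $6=srv_op_state $7=srv_admin_state $12=srv_check_result
        printf "%s op_state=%s admin_state=%s check_result=%s\n", $4, $6, $7, $12
    }'
}
```

On the dump above, `show_srv_state pages` would report op_state=2, admin_state=0 and check_result=3 for the task.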


I'm a bit baffled by the situation. If someone has more experience inserting backends on the fly with L7 checks, I'll be grateful.


--

Thomas Pedoussaut

