Hi,

HAProxy 2.9-dev5 was released on 2023/09/08. It added 108 new commits
after version 2.9-dev4.

About 30 bugs and build issues were fixed, essentially the same ones as
in 2.8.3, so I won't rehash them here. One of them concerns the way the
cluster-secret is used to produce retry tokens in QUIC. It relied on a
NUL-terminated string with a fallback to a few random bytes, meaning that
short or even empty strings could be used. This was addressed by always
computing a hash from it early, but as a result retry tokens produced by
this version are not usable as-is by an haproxy process running an older
version. Those who run multiple haproxy nodes dealing with QUIC behind an
L4 load balancer may therefore have to upgrade all nodes at once to keep
retries working (these are used during network floods). This is a little
bit annoying, but it's better to address it right now and not have to
think about it anymore, especially considering that it's only used during
attack mitigation. We'll also backport this into the next 2.8, so the same
precautions will be needed there, and then it will only be an old memory.
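
For those running several QUIC nodes behind an L4 load balancer, here is a
minimal sketch of the setting involved. It only assumes the existing global
"cluster-secret" keyword; the value shown is a placeholder and is expected
to be identical on every node, all of which need to run a version carrying
the fix for retry tokens to validate across them.

```
global
    # placeholder value: use the same long random string on all nodes
    cluster-secret "CHANGE-ME-long-random-shared-string"
```
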
Aside from this, some new stuff was added:
- CI cleanups and updates (musl, github actions v4, etc)
- minimal support for Linux capabilities. We've wanted to do this for
a long time for transparent proxying, interface binding, etc., to avoid
running the process as root at runtime, but it was never done. We finally
saw some reports (thank you Tristan) where QUIC was clearly bound to a
privileged port while running as an unprivileged user in the default
socket-per-connection mode. In this case the bind() that we do in order
to perform a sort of UDP accept() fails. It was clear that such
now-mainstream use cases can no longer justify running as root, so
capabilities were added so that the cap_net_bind_service capability
can be preserved when switching from root to the final uid.
- Proxy-protocol can now expose the extra TLVs that are received so that
a new sample fetch (fc_pp_tlv) can retrieve them. If a front component
makes use of in-house TLVs to pass extra info, you can now exploit it
in haproxy.
- health checks: we've seen a config where there were so many servers
that health checks could not all complete in time at boot and resulted
in checks timing out and all servers turning down while the process was
at 100% CPU for a long period. Since 2.7, health checks can already
migrate between threads, so now we migrate them more aggressively when
other threads are less loaded. We can also benefit from the global
"spread-checks" value to artificially extend the interval to slow them
down on overloaded threads. Finally, a queuing mechanism similar to
the server's maxconn was implemented so that each thread engages in no
more than a certain number of checks at once, and extra ones wait in a
queue. It's not enabled by default but the limit can be set using the
global "tune.max-checks-per-thread" setting. This managed to totally
address the problem even at 7k sustained checks per second without a
single failure.
- some cleanups and internal rework were done on the log server init code
in preparation for more changes to come. Such config checks used to be
centralized and are now done in the context of their proxy. No visible
change is expected.
- the master process will now display a more human-friendly message when
a worker crashes. It will also report the version and the link to the
known bugs page.
- backend SSL: the SSL sessions are stored per-thread in order to limit
locking, but as a side effect, each thread had to perform its own
handshake if no connection could be reused. This was visible with
health checks which could be responsible for as many handshakes as
there are threads. It was not a problem in 1.8 when this was done,
considering that back then 4 threads was considered large and 8 threads
huge. But now with 64-thread configs becoming more and more common it
was really becoming inefficient. Thus a thread will now automatically
reuse the last updated SSL session if it doesn't have one. This also
contributed to speeding up health check convergence and reducing
the servers' load upon haproxy reload.
- some sink/ring code refactor (should not have any visible effect, so
please do report any unexpected change).
- for configs showing blatantly inconsistent CPU bindings, a warning
will be reported. The first case is when thread sets are bound
to smaller CPU sets, thus forcing contention to happen, which will
ruin performance. The second case is when only some threads are
referenced in cpu-map and others are left free to run anywhere, which
might very well be on the same CPUs as the first ones. It will also
produce a warning. Please note that not all inconsistent bindings can
be detected, as there are too many combinations to resolve, but the
most common ones that happen by mistake are well detected.
- the HTTP client now allows configuring the default connect timeout
and retry count, so that failures can be reported much earlier for
communications inside the datacenter with short RTTs.
- SSL: the "curves" keyword is now supported on the "server" lines,
and a global "ssl-default-server-curves" keyword is added to set
the default.
- SSL: the build with the AWS-LC TLS stack should now work fine when
building with "USE_OPENSSL_AWSLC=1"; however, as far as I understand,
some algorithms necessary for QUIC are still not enabled.
The rest is mostly low-risk internal stuff to prepare future changes.
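
To make the new settings above a bit more concrete, here is a rough
configuration sketch that puts several of them together. It is only an
illustration under a few assumptions: the keyword names "setcap",
"httpclient.timeout.connect" and "httpclient.retries" are not spelled out
in this message and are taken from the 2.9 configuration manual, so please
check them against the doc of the version you run. All values, addresses,
file paths, section names and the TLV type are made up.

```
global
    user haproxy
    # keep the right to bind privileged ports once root is dropped
    # (assumed keyword name, see the note above; Linux only)
    setcap cap_net_bind_service

    nbthread 4
    # thread set and CPU set of the same size, one thread per CPU
    cpu-map auto:1/1-4 0-3

    # health checks: spread intervals by up to 20%, and let each thread
    # run at most 50 checks at once (extra ones wait in the queue)
    spread-checks 20
    tune.max-checks-per-thread 50

    # internal HTTP client: fail fast inside the datacenter
    # (assumed keyword names, see the note above)
    httpclient.timeout.connect 500ms
    httpclient.retries 2

    # default curves for all "server" lines
    ssl-default-server-curves X25519:P-256

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend fe_main
    # TCP listener receiving the PROXY protocol from a front component
    bind :443 ssl crt /etc/haproxy/site.pem accept-proxy
    # QUIC listener on a privileged port
    bind quic4@:443 ssl crt /etc/haproxy/site.pem alpn h3
    # expose an in-house TLV (type 224 = 0xE0, hypothetical custom type)
    http-request set-header X-PP-TLV %[fc_pp_tlv(224)]
    default_backend be_app

backend be_app
    # per-server override of the default curves
    server s1 192.0.2.10:443 ssl verify required ca-file /etc/haproxy/ca.pem curves X25519 check
```

The max-checks-per-thread limit is deliberately left to the user since it
is not enabled by default; the value of 50 here is arbitrary and should be
sized against the number of servers and the check interval.
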
Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/2.9/src/
   Git repository   : https://git.haproxy.org/git/haproxy.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy.git
   Changelog        : https://www.haproxy.org/download/2.9/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages
Willy
---
Complete changelog :
Alexander Stephan (6):
CLEANUP/MINOR: connection: Improve consistency of PPv2 related constants
MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs
MEDIUM: sample: Add fetch for arbitrary TLVs
MINOR: sample: Refactor fc_pp_authority by wrapping the generic TLV fetch
MINOR: sample: Refactor fc_pp_unique_id by wrapping the generic TLV fetch
MINOR: sample: Add common TLV types as constants for fc_pp_tlv
Andrew Hopkins (6):
BUILD: ssl: Build with new cryptographic library AWS-LC
REGTESTS: ssl: skip ssl_dh test with AWS-LC
CI: scripts: add support to build-ssl.sh to download and build AWS-LC
CI: add support to matrix.py to determine the latest AWS-LC release
CI: Update matrix.py so all code is contained in functions.
CI: github: Add a weekly CI run building with AWS-LC
Aurelien DARRAGON (18):
BUG/MINOR: hlua/action: incorrect message on E_YIELD error
MINOR: http_ana: position the FINAL flag for http_after_res execution
MINOR: ring: add a function to compute max ring payload
BUG/MEDIUM: ring: adjust maxlen consistency check
MINOR: sink: simplify post_sink_resolve function
MINOR: log/sink: detect when log maxlen exceeds sink size
MINOR: sink: inform the user when logs will be implicitly truncated
MEDIUM: sink: don't perform implicit truncations when maxlen is not set
MINOR: log: move log-forwarders cleanup in log.c
MEDIUM: httpclient/logs: rely on per-proxy post-check instead of global one
MINOR: log: add dup_logsrv() helper function
MEDIUM: log/sink: make logsrv postparsing more generic
MEDIUM: fcgi-app: properly postresolve logsrvs
MEDIUM: spoe-agent: properly postresolve log rings
MINOR: sink: add helper function to deallocate sink struct
MEDIUM: sink/ring: introduce high level ring creation helper function
MEDIUM: sink: add sink_finalize() function
CLEANUP: log: remove unnecessary trim in __do_send_log
Chris Staite (1):
BUG/MEDIUM: h1-htx: Ensure chunked parsing with full output buffer
Christopher Faulet (13):
DEBUG: applet: Properly report opposite SC expiration dates in traces
BUG/MEDIUM: stconn: Update stream expiration date on blocked sends
BUG/MINOR: stconn: Don't report blocked sends during connection establishment
BUG/MEDIUM: stconn: Wake applets on sending path if there is a pending shutdown
BUG/MEDIUM: stconn: Don't block sends if there is a pending shutdown
BUG/MINOR: stconn: Don't inhibit shutdown on connection on error
BUG/MEDIUM: applet: Fix API for function to push new data in channels buffer
BUG/MEDIUM: stconn: Report read activity when a stream is attached to front SC
BUG/MEDIUM: applet: Report an error if applet request more room on aborted SC
BUG/MEDIUM: stconn/stream: Forward shutdown on write timeout
NUG/MEDIUM: stconn: Always update stream's expiration date after I/O
BUG/MINOR: applet: Always expect data when CLI is waiting for a new command
BUG/MINOR: ring/cli: Don't expect input data when showing events
Frédéric Lécaille (11):
BUG/MINOR: quic: Possible skipped RTT sampling
MINOR: quic: Add a trace to quic_release_frm()
BUG/MAJOR: quic: Really ignore malformed ACK frames.
BUG/MINOR: quic: Unchecked pointer to packet number space dereferenced
BUILD: quic: Compilation issue on 32-bits systems with quic_may_send_bytes()
BUG/MINOR: quic: Unchecked pointer to Handshake packet number space
BUG/MINOR: quic: Wrong RTT adjusments
BUG/MINOR: quic: Wrong RTT computation (srtt and rrt_var)
BUG/MINOR: quic: Dereferenced unchecked pointer to Handshke packet number space
BUG/MINOR: quic: Wrong cluster secret initialization
CLEANUP: quic: Remove useless free_quic_tx_pkts() function.
Ilya Shipitsin (2):
CI: musl: highlight section if there are coredumps
CI: musl: drop shopt in workflow invocation
Miroslav Zagorac (1):
MINOR: properly mark the end of the CLI command in error messages
Remi Tricot-Le Breton (1):
MINOR: cache: Change hash function in default normalizer used in case of "vary"
Tim Duesterhus (1):
CI: Update to actions/checkout@v4
William Lallemand (7):
BUG/MINOR: ssl/cli: can't find ".crt" files when replacing a certificate
DOC: configuration: update examples for req.ver
MINOR: global: export the display_version() symbol
MEDIUM: mworker: display a more accessible message when a worker crash
MINOR: httpclient: allow to configure the retries
MINOR: httpclient: allow to configure the timeout.connect
MINOR: ssl: add support for 'curves' keyword on server lines
Willy Tarreau (41):
BUG/MEDIUM: mux-h2: fix crash when checking for reverse connection after error
BUILD: import: guard plock.h against multiple inclusion
BUILD: pools: import plock.h to build even without thread support
BUG/MINOR: stream: protect stream_dump() against incomplete streams
DOC: config: mention uid dependency on the tune.quic.socket-owner option
MEDIUM: capabilities: enable support for Linux capabilities
MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
DOC: ssl: add some comments about the non-obvious session allocation stuff
CLEANUP: ssl: keep a pointer to the server in ssl_sock_init()
MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
MINOR: server/ssl: maintain an index of the last known valid SSL session
MINOR: server/ssl: clear the shared good session index on failure
MEDIUM: server/ssl: pick another thread's session when we have none yet
MINOR: activity: report the current run queue size
BUG/MINOR: checks: do not queue/wake a bounced check
MINOR: checks: start the checks in sleeping state
MINOR: checks: pin the check to its thread upon wakeup
MINOR: check: remember when we migrate a check
MINOR: check/activity: collect some per-thread check activity stats
MINOR: checks: maintain counters of active checks per thread
MINOR: check: also consider the random other thread's active checks
MEDIUM: checks: search more aggressively for another thread on overload
MEDIUM: checks: implement a queue in order to limit concurrent checks
MINOR: checks: also consider the thread's queue for rebalancing
BUG/MEDIUM: connection: fix pool free regression with recent ppv2 TLV patches
BUG/MINOR: stream: further protect stream_dump() against incomplete sessions
BUILD: bug: make BUG_ON() void to avoid a rare warning
BUILD: checks: shut up yet another stupid gcc warning
MINOR: cpuset: add ha_cpuset_isset() to check for the presence of a CPU in a set
MINOR: cpuset: add ha_cpuset_or() to bitwise-OR two CPU sets
MINOR: cpuset: centralize a reliable bound cpu detection
MEDIUM: threads: detect incomplete CPU bindings
MEDIUM: threads: detect excessive thread counts vs cpu-map
MINOR: tasks/stats: report the number of niced tasks in "show info"
MEDIUM: init: initialize the trash earlier
MINOR: tools: add function read_line_to_trash() to read a line of a file
MINOR: cfgparse: use read_line_from_trash() to read from /sys
MEDIUM: cfgparse: assign NUMA affinity to cpu-maps
MINOR: cpuset: dynamically allocate cpu_map
REORG: cpuset: move parse_cpu_set() and parse_cpumap() to cpuset.c
---