Hi,

HAProxy 3.2-dev8 was released on 2025/03/21. It added 119 new commits
after version 3.2-dev7.

As mentioned in the 3.1.6 announcement, a few bugs were addressed, but
nothing critical. For the new stuff:

  - automatic CPU binding (formerly known as "NUMA patches"): this work,
    which started almost two years ago and which I had hoped to see merged
    into each version since 2.9, was finally completed! It extends the
    current CPU topology detection to better bind threads and thread
    groups.

    First, by default, nothing will change in 3.2 compared to previous
    versions. The new features consist in detecting the detailed CPU
    topology (nodes, packages, CCX, L3 caches, cores, clusters, threads,
    etc) and doing the best to optimally bind to them and arrange the
    groups to limit the costly inter-CCX communications. It comes with a
    "cpu-set" directive that allows one to bind only to, or exclude,
    certain CPUs based on their node/core/thread/cluster number. For
    example if one wants to only bind to odd or even threads to leave the
    other ones for the NIC drivers, it is trivial to do with a single
    directive.

    Second, another directive, "cpu-policy", describes how to use the
    selected CPUs. The default one, "first-usable-node", does exactly like
    today, i.e. it will only bind to the first node with available CPUs
    and limit itself to a single group and 64 threads max. Another policy
    is "group-by-cluster": it will create one thread group per CCX/L3
    cache and configure as many threads as there are enabled CPUs on them.
    It can also create multiple groups if there are more than 64 CPUs in
    one of them. It's possible that it will become the default policy
    starting with 3.3, as it can use the full machine in an efficient way.
    Just using this one was sufficient to multiply the performance by 3 on
    a 64-core EPYC, i.e. it was the same as what can be achieved using
    precise "cpu-map" directives, which become quite difficult to use with
    many-core systems. A few other policies are available for CPUs with
    P+E cores to prefer "Performance" cores or "Efficiency" cores.
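
To make the "group-by-cluster" idea more concrete, here is a minimal
Python sketch of the grouping logic as described above: one thread group
per CCX/L3 cluster, one thread per enabled CPU, and several groups when a
cluster holds more than 64 CPUs. This is only an illustration, not
HAProxy's code: the function name group_by_cluster(), the cpu_to_cluster
mapping and the naive "fill 64 then start a new group" split are all
assumptions made for the example, and the real cpu-topo code may split
oversized clusters differently.

```python
from collections import defaultdict

# Per-group thread limit taken from the announcement ("64 threads max").
MAX_THREADS_PER_GROUP = 64


def group_by_cluster(cpu_to_cluster, max_per_group=MAX_THREADS_PER_GROUP):
    """Return a list of thread groups, each one a sorted list of CPU ids.

    cpu_to_cluster is a hypothetical mapping {cpu_id: cluster_id} that
    only contains the enabled CPUs (e.g. what is left after "cpu-set").
    """
    clusters = defaultdict(list)
    for cpu, cluster in cpu_to_cluster.items():
        clusters[cluster].append(cpu)

    groups = []
    for cluster in sorted(clusters):
        cpus = sorted(clusters[cluster])
        # Assumption: an oversized cluster is simply cut into chunks of
        # at most max_per_group CPUs.
        for start in range(0, len(cpus), max_per_group):
            groups.append(cpus[start:start + max_per_group])
    return groups


# Hypothetical machine: 16 enabled CPUs spread over 4 clusters of 4 CPUs.
topology = {cpu: cpu // 4 for cpu in range(16)}
groups = group_by_cluster(topology)

for number, cpus in enumerate(groups, start=1):
    print(f"thread group {number}: {len(cpus)} threads on CPUs {cpus}")
```

The point of grouping this way is that threads sharing data inside a
group also share an L3 cache, so the costly inter-CCX traffic is limited
to what has to cross groups.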

    We're interested in feedback from those dealing with large systems,
    particularly multi-socket ones, as well as VMs and containers, to make
    sure we haven't missed anything. Many tests were run on about 20-25
    different systems, as well as emulations of about 10 other ones based
    on /sys captures. For those who prefer, I have created a discussion on
    GitHub, feel free to participate and share feedback (successes,
    failures and suggestions):

       https://github.com/orgs/haproxy/discussions/2901

  - Prometheus and stats convergence: those using Prometheus probably
    noticed from time to time that it's difficult to keep the two
    synchronized, so sometimes we add some new stats and forget to do the
    same for Prometheus. Some changes were made to extend the stats
    internal representation so that Prometheus can rely on it. This way
    there is now a single place to declare new metrics that should be
    exposed at the two places. If well done, it should not change anything
    (actually the only thing is that the warnings counter will finally be
    exported by Prometheus). Please give it a try to confirm that
    everything runs as smoothly as expected.

  - the log-forward sections now support an "option host" to decide how to
    fill the host part of outgoing log messages (leave it as-is, replace
    it, append), since different users expect different behaviors.

  - some new converters are provided to support JWS signing and to verify
    JSON Web Tokens (JWT). Please just bear with me, I have zero idea
    about what JWS means nor what it's used for, but there is info in the
    doc about it :-)  Apparently it's related to authentication.

  - some changes were made to the internal representation of certificates
    that are not expected to have any visible effect. If you're using
    complex setups, please give it a quick try to verify that you don't
    face any error at load time.

  - the "wait ... srv-removable" CLI command was optimised so that it
    consumes much less CPU while waiting for a server to be removable. It
    used to force thread isolation during the check but thanks to some
    recent changes this is no longer necessary, so those with many servers
    being constantly added and removed at run time and who used to notice
    CPU spikes when a whole farm went down will see a significant
    improvement.

  - a small "show pools detailed" CLI command will now show all pools
    registered behind a single entry. That's useless for normal users but
    developers might ask about this in the future when chasing a memory
    error.

  - we found a case on a 128-thread EPYC where some watchdog warnings
    could be emitted from time to time under extreme contention on the
    mt_lists, indicating that some CPUs were blocked for at least 100ms.
    We found it was caused by the margin in the exponential back-off,
    which seems too high for these CPUs, so we shortened it. If you had
    faced warnings in the past, we're interested in knowing if they
    disappeared. If you observe a higher CPU usage, we're interested as
    well (this shouldn't be the case based on our tests).

  - the Lua AppletTCP:receive() now supports an optional timeout, making
    it easier to write interactive utilities supporting a periodic refresh
    (think about a "top" equivalent for example). For the record, this
    allowed us to write a dirty "tetris" game that works as an applet. I
    have not committed it yet because it needs some polishing but it
    illustrates some possibilities and showed us some limitations and even
    two bugs. We hope to address such small limitations before 3.2-final,
    so that they ease the writing of convenient utilities, including
    sniffers, proxies etc, not just arcade games ;-)

The rest is a few cleanups and doc updates.

I'm really insisting that sensitive changes are merged before dev9, which
is due for the first week of April. Past this point we'll declare the
feature freeze which as usual will mainly mean "no more big change", so
that we can spend the rest of the time finishing what's already started
and polishing/fixing what's already merged. I know that there are some SSL
infrastructure updates in the pipe, and a rework of leastconn to address
the scalability issues on large systems. We've identified a number of
small cleanups that are worth doing before 3.2-final (e.g. minor changes
to Lua mentioned above, merge of h2+h3 header validation etc). Also the
doc updates (namely the resolvers with init_addr that Lukas & Luke worked
on) need to be decided on and merged.

Overall I'm starting to like what 3.2 is becoming. It could also be the
moment to think about the more intense changes to perform in 3.3 (e.g. if
we need to anticipate deprecation warnings it's not too late), and
sometimes doing some preparatory work before the release eases the
backport of fixes later.

Next week I'll be quite busy so maybe not always available to respond to
discussions but do not hesitate to share anything you might have in
mind ;-)

Ah and please if you have not yet started to play with 3.2-dev, really,
give it a try *NOW*. There's still time to fix issues, rename options etc,
and it's in good shape, close to what 3.2-final should be. And if you're
lucky you might even notice improvements which will make you want to stick
to it.

Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/3.2/src/
   Git repository   : https://git.haproxy.org/git/haproxy.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy.git
   Changelog        : https://www.haproxy.org/download/3.2/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (2):
      BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local
      BUG/MINOR: mux-quic: remove extra BUG_ON() in _qcc_send_stream()

Aurelien DARRAGON (29):
      CLEANUP: log-forward: remove useless options2 init
      CLEANUP: log: add syslog_process_message() helper
      MINOR: proxy: add proxy->options3
      MINOR: log: migrate log-forward options from proxy->options2 to options3
      MINOR: log: provide source address information in syslog_process_message()
      MINOR: tools: only print address in sa2str() when port == -1
      MINOR: log: add "option host" log-forward option
      MINOR: log: handle log-forward "option host"
      MEDIUM: log: change default "host" strategy for log-forward section
      DOC: management: rename some last occurences from domain "dns" to "resolvers"
      BUG/MINOR: stats: fix capabilities and hide settings for some generic metrics
      BUG/MINOR: log: prevent saddr NULL deref in syslog_io_handler()
      BUG/MINOR: hlua: fix optional timeout argument index for AppletTCP:receive()
      BUG/MEDIUM: hlua/cli: fix cli applet UAF in hlua_applet_wakeup()
      MINOR: stats: add .generic explicit field in stat_col struct
      MINOR: stats: STATS_PX_CAP___B_ macro
      MINOR: stats: add .cap for some static metrics
      MINOR: stats: use stat_col storage stat_cols_info
      MEDIUM: promex: switch to using stat_cols_info for global metrics
      MINOR: promex: expose ST_I_INF_WARNINGS (AKA total_warnings) metric
      MEDIUM: promex: switch to using stat_cols_px for front/back/server metrics
      MINOR: stats: explicitly add frontend cap for ST_I_PX_REQ_TOT
      CLEANUP: promex: remove unused PROMEX_FL_{INFO,FRONT,BACK,LI,SRV} flags
      MINOR: stats: add alt_name field to stat_col struct
      MINOR: stats: add alt name info to stat_cols_info where relevant
      MINOR: promex: get rid of promex_global_metric array
      MINOR: stats-proxy: add alt_name field for ME_NEW_{FE,BE,PX} helpers
      MINOR: stats-proxy: add alt name info to stat_cols_px where relevant
      MINOR: promex: get rid of promex_st_metrics array

Christopher Faulet (1):
      BUG/MINOR: mux-h2: Reset streams with NO_ERROR code if full response was already sent

Olivier Houchard (1):
      MEDIUM: mt_list: Reduce the max number of loops with exponential backoff

Valentine Krasnobaeva (3):
      MINOR: cpu-topo: fix unused stack var 'cpu2' reported by coverity
      BUG/MINOR: limits: compute_ideal_maxconn: don't cap remain if fd_hard_limit=0
      MINOR: limits: fix check_if_maxsock_permitted description

William Lallemand (7):
      MINOR: jws: implement JWS signing
      TESTS: jws: implement a test for JWS signing
      CI: github: add "jose" to apt dependencies
      MINOR: jws: add new functions in jws.h
      MINOR: jws: use jwt_alg type instead of a char
      MINOR: tools: path_base() concatenates a path with a base path
      MEDIUM: ssl/ckch: make the ckch_conf more generic

Willy Tarreau (76):
      BUG/MEDIUM: thread: use pthread_self() not ha_pthread[tid] in set_affinity
      MINOR: compiler: add a simple macro to concatenate resolved strings
      MINOR: compiler: add a new __decl_thread_var() macro to declare local variables
      BUILD: tools: silence a build warning when USE_THREAD=0
      BUILD: backend: silence a build warning when threads are disabled
      MINOR: cli: export cli_io_handler() to ease symbol resolution
      MINOR: tools: improve symbol resolution without dl_addr
      MINOR: tools: ease the declaration of known symbols in resolve_sym_name()
      MINOR: tools: teach resolve_sym_name() a few more common symbols
      BUILD: tools: avoid a build warning on gcc-4.8 in resolve_sym_name()
      DEV: ncpu: also emulate sysconf() for _SC_NPROCESSORS_*
      DOC: design-thoughts: commit numa-auto.txt
      MINOR: cpuset: make the API support negative CPU IDs
      MINOR: thread: rely on the cpuset functions to count bound CPUs
      MINOR: cpu-topo: add ha_cpu_topo definition
      MINOR: cpu-topo: allocate and initialize the ha_cpu_topo array.
      MINOR: cpu-topo: rely on _SC_NPROCESSORS_CONF to trim maxcpus
      MINOR: cpu-topo: add a function to dump CPU topology
      MINOR: cpu-topo: update CPU topology from excluded CPUs at boot
      REORG: cpu-topo: move bound cpu detection from cpuset to cpu-topo
      MINOR: cpu-topo: add detection of online CPUs on Linux
      MINOR: cpu-topo: add detection of online CPUs on FreeBSD
      MINOR: cpu-topo: try to detect offline cpus at boot
      MINOR: cpu-topo: add CPU topology detection for linux
      MINOR: cpu-topo: also store the sibling ID with SMT
      MINOR: cpu-topo: add NUMA node identification to CPUs on Linux
      MINOR: cpu-topo: add NUMA node identification to CPUs on FreeBSD
      MINOR: thread: turn thread_cpu_mask_forced() into an init-time variable
      MINOR: cfgparse: move the binding detection into numa_detect_topology()
      MINOR: cfgparse: use already known offline CPU information
      MINOR: global: add a command-line option to enable CPU binding debugging
      MINOR: cpu-topo: add a new "cpu-set" global directive to choose cpus
      MINOR: cpu-topo: add "drop-cpu" and "only-cpu" to cpu-set
      MEDIUM: thread: start to detect thread groups and threads min/max
      MEDIUM: cpu-topo: make sure to properly assign CPUs to threads as a fallback
      MEDIUM: thread: reimplement first numa node detection
      MEDIUM: cfgparse: remove now unused numa & thread-count detection
      MINOR: cpu-topo: refine cpu dump output to better show kept/dropped CPUs
      MINOR: cpu-topo: fall back to nominal_perf and scaling_max_freq for the capacity
      MINOR: cpu-topo: use cpufreq before acpi cppc
      MINOR: cpu-topo: boost the capacity of performance cores with cpufreq
      MINOR: cpu-topo: skip CPU detection when /sys/.../cpu does not exist
      MINOR: cpu-topo: skip identification of non-existing CPUs
      MINOR: cpu-topo: skip CPU properties that we've verified do not exist
      MINOR: cpu-topo: implement a sorting mechanism for CPU index
      MINOR: cpu-topo: implement a sorting mechanism by CPU locality
      MINOR: cpu-topo: implement a CPU sorting mechanism by cluster ID
      MINOR: cpu-topo: ignore single-core clusters
      MINOR: cpu-topo: assign clusters to cores without and renumber them
      MINOR: cpu-topo: make sure we don't leave unassigned IDs in the cpu_topo
      MINOR: cpu-topo: assign an L3 cache if more than 2 L2 instances
      MINOR: cpu-topo: renumber cores to avoid holes and make them contiguous
      MINOR: cpu-topo: add a function to sort by cluster+capacity
      MINOR: cpu-topo: consider capacity when forming clusters
      MINOR: cpu-topo: create an array of the clusters
      MINOR: cpu-topo: ignore excess of too small clusters
      MINOR: cpu-topo: add "only-node" and "drop-node" to cpu-set
      MINOR: cpu-topo: add "only-thread" and "drop-thread" to cpu-set
      MINOR: cpu-topo: add "only-core" and "drop-core" to cpu-set
      MINOR: cpu-topo: add "only-cluster" and "drop-cluster" to cpu-set
      MINOR: cpu-topo: add a CPU policy setting to the global section
      MINOR: cpu-topo: add a 'first-usable-node' cpu policy
      MEDIUM: cpu-topo: use the "first-usable-node" cpu-policy by default
      CLEANUP: thread: now remove the temporary CPU node binding code
      MINOR: cpu-topo: add cpu-policy "group-by-cluster"
      MEDIUM: cpu-topo: let the "group-by-cluster" split groups
      MINOR: cpu-topo: add a new "performance" cpu-policy
      MINOR: cpu-topo: add a new "efficiency" cpu-policy
      MINOR: cpu-topo: add a new "resource" cpu-policy
      MINOR: hlua: add an optional timeout to AppletTCP:receive()
      MINOR: stream: decrement srv->served after detaching from the list
      MINOR: server: simplify srv_has_streams()
      CLEANUP: server: make it clear that srv_check_for_deletion() is thread-safe
      MINOR: cli/server: don't take thread isolation to check for srv-removable
      MINOR: pools: rename the "by_what" field of the show pools context to "how"
      MINOR: cli/pools: record the list of pool registrations even when merging them

---