On Tue, Jan 24, 2023 at 11:05:37PM +0100, Willy Tarreau wrote:
> On Tue, Jan 24, 2023 at 02:15:08PM -0600, Marc West wrote:
> > > Stupid question but I prefer to ask in order to be certain, are all of
> > > these 32 threads located on the same physical CPU ? I just want to be
> > > sure that locks (kernel or user) are not traveling between multiple CPU
> > > sockets, as that ruins scalability.
> > 
> > Very good observation. This is a 2 socket system, 20 cores per socket,
> > and since there is no affinity in OpenBSD unfortunately the 32 threads
> > are not on the same physical CPU.
> 
> Ah, then that's particularly bad. When you're saying "there is no
> affinity", do you mean the OS doesn't implement anything or only that
> haproxy doesn't support it ? Could you please check if you have man
> pages about "cpuset_setaffinity" or "sched_setaffinity" ? The former
> is the FreeBSD version and the second the Linux version.

Unfortunately, having a quick look at OpenBSD's sources, I don't think
they provide anything that could look like that. In fact, I'm not sure
there is any way to restrict a process to a specific CPU.

[...]

> Oh thank you very much. From what I'm seeing we're here:
> 
>   0x00000af892c770b0 <conn_backend_get+1136>:   mov    %r12,%rdi
>   0x00000af892c770b3 <conn_backend_get+1139>:   callq  0xaf892c24e40 
> <srv_lookup_conn_next>
>   0x00000af892c770b8 <conn_backend_get+1144>:   mov    %rax,%r12
>   0x00000af892c770bb <conn_backend_get+1147>:   test   %rax,%rax
>   0x00000af892c770be <conn_backend_get+1150>:   je     0xaf892c770e5 
> <conn_backend_get+1189>
>   0x00000af892c770c0 <conn_backend_get+1152>:   mov    0x18(%r12),%rax
> =>0x00000af892c770c5 <conn_backend_get+1157>:   mov    0xa0(%rax),%r11
>   0x00000af892c770cc <conn_backend_get+1164>:   test   %r11,%r11
>   0x00000af892c770cf <conn_backend_get+1167>:   je     0xaf892c770b0 
> <conn_backend_get+1136>
>   0x00000af892c770d1 <conn_backend_get+1169>:   mov    %r12,%rdi
>   0x00000af892c770d4 <conn_backend_get+1172>:   mov    %r13d,%esi
> 
>   1229                          conn = 
> srv_lookup_conn(&srv->per_thr[i].safe_conns, hash);
>   1230                          while (conn) {
>   1231        ======>>>                 if (conn->mux->takeover && 
> conn->mux->takeover(conn, i) == 0) {
>                                             ^^^^^^^^^^^
> 
> It's here. %rax==0, which is conn->mux when we're trying to dereference
> it to retrieve ->takeover. I can't see how it's possible to have a NULL
> mux here in an idle connection since they're placed there by the muxes
> themselves. But maybe there's a tiny race somewhere that's impossible to
> reach except under such an extreme contention between the two sockets.
> I really have no idea regarding this one for now, maybe Olivier could
> spot a race here ? Maybe we're missing a barrier somewhere.
 
One way I can see that happening, as unlikely as it may be, is if
h1_takeover() is called, but fails to allocate a task or a tasklet, and
will call h1_release(), which will set the mux to NULL and call
conn_free(). It was previously ok to happen, because conn_free() would
unconditionally remove the connection from any list, but since the idle
connections now stay in a tree, and no longer in a list, that no longer
happens.
Another possibility of h1_release() been called while still in the
idle tree is if h1_wake() gets called, but the only way I see that
happening is from mmux_stopping_process(), whatever that is, so that's
unlikely to be the problem in this case.
Anyway, I suggest doing something along the line of the patch attached,
and add a BUG_ON() in conn_free() to catch any freeing of connection
still in the idle tree.

Olivier
>From 2a0ac4b84b97aa05cd7befc5fcf45b03795f2e76 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <cog...@ci0.org>
Date: Tue, 24 Jan 2023 23:59:32 +0100
Subject: [PATCH] MINOR: Add a BUG_ON() to detect destroying connection in idle
 list

Add a BUG_ON() in conn_free(), to check that zhen we're freeing a
connection, it is not still in the idle connections tree, otherwise the
next thread that will try to use it will probably crash.
---
 src/connection.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/connection.c b/src/connection.c
index 4a73dbcc8..97619ec26 100644
--- a/src/connection.c
+++ b/src/connection.c
@@ -498,6 +498,10 @@ void conn_free(struct connection *conn)
 	pool_free(pool_head_uniqueid, istptr(conn->proxy_unique_id));
 	conn->proxy_unique_id = IST_NULL;
 
+	/* Make sure the connection is not left in the idle connection tree */
+	if (conn->hash_node != NULL)
+		BUG_ON(conn->hash_node->node.node.leaf_p != NULL);
+
 	pool_free(pool_head_conn_hash_node, conn->hash_node);
 	conn->hash_node = NULL;
 
-- 
2.36.1

Reply via email to