Le 11/12/2020 à 11:45, Christopher Faulet a écrit :
Le 10/12/2020 à 19:38, Peter Statham a écrit :
> Sorry for the delay in getting back to you. It is the same crash,
> we've been trying to narrow down the exact combination of compiler,
> libraries, kernel, hypervisor, etc. that causes the issue now that we
> know it isn't universal but that's turning out to be trickier than
> identifying the issue.
>
> I only backported the changes to the src/lb_fwlc.c file, but
> backporting 1b87748ff5 seems to work just as well. So far we haven't
> been able to provoke the issue with the changes in 1b87748ff5 applied
> to the 1.8 tree so that does look like a solution.
>
> We will keep testing and trying to narrow the issue down.
Since I wrote the above I have managed to replicate the issue on 1.8 with
applied, so it looks as if that was not the solution after all.
I include a binary built from 1.8.27 with 1b87748ff5 backported and a core dump.
haproxy-1.8.27+1b87748ff5
<https://drive.google.com/file/d/1KPs3rBpkeqE9GEOfjF8Ocycd1wa4RjqW/view?usp=drive_web>
haproxy-1.8.27+1b87748ff5.core
<https://drive.google.com/file/d/1chBPoogHBuGlnV1o5sO9YP6BldpRH4d3/view?usp=drive_web>
Thanks Peter, I'll try to take a look today. The reproducer is the same ?
Ok, in fact it is pretty easy to reproduce. Because I found a similar bug on
newer versions, I have not tested on the 1.8. Unfortunately, there is second
bug, specific to the 1.8.
I attached a patch that should fix it. In fact, the bug exists because of the
rendez-vous point. It was removed on newer versions. But, on 1.8, there may have
a short time to commit server state changes because we must wait for all
threads. Thus, we must take care to not use info of the next state too early.
And this is the bug here. In the leasconn algo, the next server weight is used,
instead of the current one, to reposition the server in the tree. The next
server weight must only be used when the server state changes are committed.
Peter, could you confirm it fixes you bug ?
--
Christopher Faulet
>From 89c1eb3a9a1cc0b1d9743a74d299a329cbe3b910 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <[email protected]>
Date: Fri, 11 Dec 2020 15:36:01 +0100
Subject: [PATCH] BUG/MEDIUM: lb-leastconn: Reposition a server using the right
eweight
Depending on the context, the current eweight or the next one must be used
to reposition a server in the tree. When the server state is updated, for
instance its weight, the next eweight must be used because it is not yet
committed. However, when the server is used, on normal conditions, the
current eweight must be used.
It is important to do so, because the server state is updated and committed
inside the rendez-vous point. Thus, the next server state may be unsync with
the current state for a short time, waiting all threads join the rendez-vous
point. It is especially a problem if the next eweight is set to 0. Because
otherwise, it must not be used to reposition the server in the tree, leading
to a divide by 0.
This bug is specific to the 1.8. On previous versions, there is no thread
support. On newer ones, the rendez-vous point was removed. Thus, there is no
upstream commit ID for this patch. And no backport is needed.
---
src/lb_fwlc.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/src/lb_fwlc.c b/src/lb_fwlc.c
index e2740994e..decb38697 100644
--- a/src/lb_fwlc.c
+++ b/src/lb_fwlc.c
@@ -37,13 +37,19 @@ static inline void fwlc_dequeue_srv(struct server *s)
eb32_delete(&s->lb_node);
}
-/* Queue a server in its associated tree, assuming the weight is >0.
+/* Queue a server in its associated tree, assuming the <eweight> is >0.
* Servers are sorted by #conns/weight. To ensure maximum accuracy,
* we use #conns*SRV_EWGHT_MAX/eweight as the sorting key.
+ *
+ * NOTE: Depending on the calling context, we use s->next_eweight or
+ * s->cur_eweight. The next value is used when the server state is updated
+ * (because the weight changed for instance). During this step, the server
+ * state is not yet committed. The current value is used to reposition the
+ * server in the tree. This happens when the server is used.
*/
-static inline void fwlc_queue_srv(struct server *s)
+static inline void fwlc_queue_srv(struct server *s, unsigned int eweight)
{
- s->lb_node.key = s->served * SRV_EWGHT_MAX / s->next_eweight;
+ s->lb_node.key = s->served * SRV_EWGHT_MAX / eweight;
eb32_insert(s->lb_tree, &s->lb_node);
}
@@ -56,7 +62,7 @@ static void fwlc_srv_reposition(struct server *s)
HA_SPIN_LOCK(LBPRM_LOCK, &s->proxy->lbprm.lock);
if (s->lb_tree) {
fwlc_dequeue_srv(s);
- fwlc_queue_srv(s);
+ fwlc_queue_srv(s, s->cur_eweight);
}
HA_SPIN_UNLOCK(LBPRM_LOCK, &s->proxy->lbprm.lock);
}
@@ -161,7 +167,7 @@ static void fwlc_set_server_status_up(struct server *srv)
}
/* note that eweight cannot be 0 here */
- fwlc_queue_srv(srv);
+ fwlc_queue_srv(srv, srv->next_eweight);
out_update_backend:
/* check/update tot_used, tot_weight */
@@ -216,7 +222,7 @@ static void fwlc_update_server_weight(struct server *srv)
srv->lb_tree = &p->lbprm.fwlc.act;
}
- fwlc_queue_srv(srv);
+ fwlc_queue_srv(srv, srv->next_eweight);
update_backend_weight(p);
srv_lb_commit_status(srv);
@@ -254,7 +260,7 @@ void fwlc_init_server_tree(struct proxy *p)
if (!srv_currently_usable(srv))
continue;
srv->lb_tree = (srv->flags & SRV_F_BACKUP) ? &p->lbprm.fwlc.bck : &p->lbprm.fwlc.act;
- fwlc_queue_srv(srv);
+ fwlc_queue_srv(srv, srv->next_eweight);
}
}
--
2.26.2