Le 12/03/2018 à 09:48, Christopher Faulet a écrit :
Le 05/03/2018 à 21:06, Dennis Jacobfeuerborn a écrit :
Hi,
today I started experimenting with the HAProxy 1.8.4 release and ran
into a Problem when it comes to reloading the configuration (USR2).
I'm running the release tarball via RPM on a CentOS 7 System in
master-worker mode and every time I perform a "systemctl reload" the
worker process suddenly uses 100% cpu and can only be killed via "kill -9".
Once I do so the old processes go away and two new ones show up and
everything goes back to normal.

After reducing the config bit by bit I was able to isolate the config
that is causing this:

bind abns@haproxy-ssl accept-proxy

Once I replace the "abns@haproxy-ssl" bit with an ip:port combination
the problem goes away so it seems the "abns@" part is the culprit here.

Regards,
    Dennis


Hi Dennis,

A recent commit could fix your problem:

    http://git.haproxy.org/?p=haproxy.git;a=commitdiff;h=ec9516a6

It was pushed in upstream and will be backported in 1.8. Could you check
if it works please ?


Hi,

I finally found some time to investigate this issue and found an obvious bug. Here is the patch.

Willy, unless you see something wrong in my patch, I think you can safely merge it right now.

Thanks,
--
Christopher Faulet
>From 8a175489c5f7febfb4fb0d0b137ff64e49f323a8 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <[email protected]>
Date: Fri, 16 Mar 2018 10:04:47 +0100
Subject: [PATCH] BUG/MEDIUM: threads/unix: Fix a deadlock when a listener is
 temporarily disabled

When a listener is temporarily disabled, we start by locking it and then we call
.pause callback of the underlying protocol (tcp/unix). For TCP listeners, this
is not a problem. But listeners bound on an unix socket are in fact closed
instead. So .pause callback relies on unbind_listener function to do its job.

Unfortunatly, unbind_listener hold the listener's lock and then call an internal
function to unbind it. So, there is a deadlock here. This happens during a
reload. To fix the problemn, the function do_unbind_listener, which is lockless,
is now exported and is called when a listener bound on an unix socket is
temporarily disabled.

This patch must be backported in 1.8.
---
 include/proto/listener.h |  5 +++++
 src/listener.c           | 23 ++++++++++-------------
 src/proto_uxst.c         |  4 +++-
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/include/proto/listener.h b/include/proto/listener.h
index a33aeed4..fdace2b5 100644
--- a/include/proto/listener.h
+++ b/include/proto/listener.h
@@ -60,6 +60,11 @@ int disable_all_listeners(struct protocol *proto);
 /* Dequeues all of the listeners waiting for a resource in wait queue <queue>. */
 void dequeue_all_listeners(struct list *list);
 
+/* Must be called with the lock held. Depending on <do_close> value, it does
+ * what unbind_listener or unbind_listener_no_close should do.
+ */
+void do_unbind_listener(struct listener *listener, int do_close);
+
 /* This function closes the listening socket for the specified listener,
  * provided that it's already in a listening state. The listener enters the
  * LI_ASSIGNED state. This function is intended to be used as a generic
diff --git a/src/listener.c b/src/listener.c
index fd707268..14e31ca0 100644
--- a/src/listener.c
+++ b/src/listener.c
@@ -49,8 +49,6 @@ static struct bind_kw_list bind_keywords = {
 
 struct xfer_sock_list *xfer_sock_list = NULL;
 
-static void __do_unbind_listener(struct listener *listener, int do_close);
-
 /* This function adds the specified listener's file descriptor to the polling
  * lists if it is in the LI_LISTEN state. The listener enters LI_READY or
  * LI_FULL state depending on its number of connections. In deamon mode, we
@@ -68,9 +66,9 @@ static void enable_listener(struct listener *listener)
 			 * want any fd event to reach it.
 			 */
 			if (!(global.tune.options & GTUNE_SOCKET_TRANSFER))
-				__do_unbind_listener(listener, 1);
+				do_unbind_listener(listener, 1);
 			else {
-				__do_unbind_listener(listener, 0);
+				do_unbind_listener(listener, 0);
 				listener->state = LI_LISTEN;
 			}
 		}
@@ -308,8 +306,10 @@ void dequeue_all_listeners(struct list *list)
 	HA_SPIN_UNLOCK(LISTENER_QUEUE_LOCK, &lq_lock);
 }
 
-/* must be called with the lock held */
-static void __do_unbind_listener(struct listener *listener, int do_close)
+/* Must be called with the lock held. Depending on <do_close> value, it does
+ * what unbind_listener or unbind_listener_no_close should do.
+ */
+void do_unbind_listener(struct listener *listener, int do_close)
 {
 	if (listener->state == LI_READY)
 		fd_stop_recv(listener->fd);
@@ -331,13 +331,6 @@ static void __do_unbind_listener(struct listener *listener, int do_close)
 	}
 }
 
-static void do_unbind_listener(struct listener *listener, int do_close)
-{
-	HA_SPIN_LOCK(LISTENER_LOCK, &listener->lock);
-	__do_unbind_listener(listener, do_close);
-	HA_SPIN_UNLOCK(LISTENER_LOCK, &listener->lock);
-}
-
 /* This function closes the listening socket for the specified listener,
  * provided that it's already in a listening state. The listener enters the
  * LI_ASSIGNED state. This function is intended to be used as a generic
@@ -345,7 +338,9 @@ static void do_unbind_listener(struct listener *listener, int do_close)
  */
 void unbind_listener(struct listener *listener)
 {
+	HA_SPIN_LOCK(LISTENER_LOCK, &listener->lock);
 	do_unbind_listener(listener, 1);
+	HA_SPIN_UNLOCK(LISTENER_LOCK, &listener->lock);
 }
 
 /* This function pretends the listener is dead, but keeps the FD opened, so
@@ -353,7 +348,9 @@ void unbind_listener(struct listener *listener)
  */
 void unbind_listener_no_close(struct listener *listener)
 {
+	HA_SPIN_LOCK(LISTENER_LOCK, &listener->lock);
 	do_unbind_listener(listener, 0);
+	HA_SPIN_UNLOCK(LISTENER_LOCK, &listener->lock);
 }
 
 /* This function closes all listening sockets bound to the protocol <proto>,
diff --git a/src/proto_uxst.c b/src/proto_uxst.c
index 19a28838..b4de1714 100644
--- a/src/proto_uxst.c
+++ b/src/proto_uxst.c
@@ -396,7 +396,9 @@ static int uxst_pause_listener(struct listener *l)
 	if (((struct sockaddr_un *)&l->addr)->sun_path[0])
 		return 1;
 
-	unbind_listener(l);
+	/* Listener's lock already held. Call lockless version of
+	 * unbind_listener. */
+	do_unbind_listener(l, 1);
 	return 0;
 }
 
-- 
2.14.3

Reply via email to