[Ubuntu-ha] [Bug 1848902] Re: haproxy in bionic can get stuck

Christian Ehrhardt  Tue, 12 Nov 2019 04:36:39 -0800

** Description changed:

+ [Impact]
+ 
+  * The master process will exit with the status of the last worker. 
+    When the worker is killed with SIGTERM, it is expected to get 143 as an
+    exit status. Therefore, we consider this exit status as normal from a
+    systemd point of view. If it happens when not stopping, the systemd
+    unit is configured to always restart, so it has no adverse effect.
+ 
+  * Backport upstream fix - adding another accepted RC to the systemd 
+    service
+ 
+ [Test Case]
+ 
+  * You want to install haproxy and have it running. Then sigterm it a lot.
+    With the fix it would restart the service all the time, well except 
+    restart limit. But in the bad case it will just stay down and didn't 
+    even try to restart it.
+ 
+    $ apt install haproxy
+    $ for x in {1..100}; do pkill -TERM -x haproxy ; sleep 0.1 ; done
+    $ systemctl status haproxy
+ 
+ [Regression Potential]
+ 
+  * This eventually is a conffile modification, so if there are other 
+    modifications done by the user they will get a prompt. But that isn't a 
+    regression. I checked the code and I can't think of another RC=143 that 
+    would due to that "no more" detected as error. I really think other 
+    than the update itself triggering a restart (as usual for services) 
+    there is no further regression potential to this.
+ 
+ [Other Info]
+  
+  * Fix already active in IS hosted cloud without issues since a while
+  * Also reports (comment #5) show that others use this in production as 
+    well
+ 
+ ---
+ 
  On a Bionic/Stein cloud, after a network partition, we saw several units
  (glance, swift-proxy and cinder) fail to start haproxy, like so:
  
  root@juju-df624b-6-lxd-4:~# systemctl status haproxy.service
  ● haproxy.service - HAProxy Load Balancer
-    Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor 
preset: enabled)
-    Active: failed (Result: exit-code) since Sun 2019-10-20 00:23:18 UTC; 1h 
35min ago
-      Docs: man:haproxy(1)
-            file:/usr/share/doc/haproxy/configuration.txt.gz
-   Process: 2002655 ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE 
$EXTRAOPTS (code=exited, status=143)
-   Process: 2002649 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS 
(code=exited, status=0/SUCCESS)
-  Main PID: 2002655 (code=exited, status=143)
+    Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor 
preset: enabled)
+    Active: failed (Result: exit-code) since Sun 2019-10-20 00:23:18 UTC; 1h 
35min ago
+      Docs: man:haproxy(1)
+            file:/usr/share/doc/haproxy/configuration.txt.gz
+   Process: 2002655 ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE 
$EXTRAOPTS (code=exited, status=143)
+   Process: 2002649 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS 
(code=exited, status=0/SUCCESS)
+  Main PID: 2002655 (code=exited, status=143)
  
  Oct 20 00:16:52 juju-df624b-6-lxd-4 systemd[1]: Starting HAProxy Load 
Balancer...
  Oct 20 00:16:52 juju-df624b-6-lxd-4 systemd[1]: Started HAProxy Load Balancer.
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: Stopping HAProxy Load 
Balancer...
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [WARNING] 292/001652 
(2002655) : Exiting Master process...
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [ALERT] 292/001652 
(2002655) : Current worker 2002661 exited with code 143
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [WARNING] 292/001652 
(2002655) : All workers exited. Exiting... (143)
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: haproxy.service: Main process 
exited, code=exited, status=143/n/a
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: haproxy.service: Failed with 
result 'exit-code'.
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: Stopped HAProxy Load Balancer.
  root@juju-df624b-6-lxd-4:~#
  
  The Debian maintainer came up with the following patch for this:
  
-   https://www.mail-archive.com/haproxy@formilux.org/msg30477.html
+   https://www.mail-archive.com/haproxy@formilux.org/msg30477.html
  
  Which was added to the 1.8.10-1 Debian upload and merged into upstream 1.8.13.
  Unfortunately Bionic is on 1.8.8-1ubuntu0.4 and doesn't have this patch.
  
  Please consider pulling this patch into an SRU for Bionic.


-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1848902

Title:
  haproxy in bionic can get stuck

Status in haproxy package in Ubuntu:
  Fix Released
Status in haproxy source package in Bionic:
  Triaged

Bug description:
  [Impact]

   * The master process will exit with the status of the last worker. 
     When the worker is killed with SIGTERM, it is expected to get 143 as an
     exit status. Therefore, we consider this exit status as normal from a
     systemd point of view. If it happens when not stopping, the systemd
     unit is configured to always restart, so it has no adverse effect.

   * Backport upstream fix - adding another accepted RC to the systemd 
     service

  [Test Case]

   * You want to install haproxy and have it running. Then sigterm it a lot.
     With the fix it would restart the service all the time, well except 
     restart limit. But in the bad case it will just stay down and didn't 
     even try to restart it.

     $ apt install haproxy
     $ for x in {1..100}; do pkill -TERM -x haproxy ; sleep 0.1 ; done
     $ systemctl status haproxy

  [Regression Potential]

   * This eventually is a conffile modification, so if there are other 
     modifications done by the user they will get a prompt. But that isn't a 
     regression. I checked the code and I can't think of another RC=143 that 
     would due to that "no more" detected as error. I really think other 
     than the update itself triggering a restart (as usual for services) 
     there is no further regression potential to this.

  [Other Info]
   
   * Fix already active in IS hosted cloud without issues since a while
   * Also reports (comment #5) show that others use this in production as 
     well

  ---

  On a Bionic/Stein cloud, after a network partition, we saw several
  units (glance, swift-proxy and cinder) fail to start haproxy, like so:

  root@juju-df624b-6-lxd-4:~# systemctl status haproxy.service
  ● haproxy.service - HAProxy Load Balancer
     Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor 
preset: enabled)
     Active: failed (Result: exit-code) since Sun 2019-10-20 00:23:18 UTC; 1h 
35min ago
       Docs: man:haproxy(1)
             file:/usr/share/doc/haproxy/configuration.txt.gz
    Process: 2002655 ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE 
$EXTRAOPTS (code=exited, status=143)
    Process: 2002649 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS 
(code=exited, status=0/SUCCESS)
   Main PID: 2002655 (code=exited, status=143)

  Oct 20 00:16:52 juju-df624b-6-lxd-4 systemd[1]: Starting HAProxy Load 
Balancer...
  Oct 20 00:16:52 juju-df624b-6-lxd-4 systemd[1]: Started HAProxy Load Balancer.
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: Stopping HAProxy Load 
Balancer...
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [WARNING] 292/001652 
(2002655) : Exiting Master process...
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [ALERT] 292/001652 
(2002655) : Current worker 2002661 exited with code 143
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [WARNING] 292/001652 
(2002655) : All workers exited. Exiting... (143)
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: haproxy.service: Main process 
exited, code=exited, status=143/n/a
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: haproxy.service: Failed with 
result 'exit-code'.
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: Stopped HAProxy Load Balancer.
  root@juju-df624b-6-lxd-4:~#

  The Debian maintainer came up with the following patch for this:

    https://www.mail-archive.com/haproxy@formilux.org/msg30477.html

  Which was added to the 1.8.10-1 Debian upload and merged into upstream 1.8.13.
  Unfortunately Bionic is on 1.8.8-1ubuntu0.4 and doesn't have this patch.

  Please consider pulling this patch into an SRU for Bionic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/haproxy/+bug/1848902/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp

[Ubuntu-ha] [Bug 1848902] Re: haproxy in bionic can get stuck

Reply via email to