[Bug 1881762] Re: resource timeout not respecting units

Rafael David Tinoco Fri, 25 Sep 2020 14:01:46 -0700

** Description changed:

+ [Impact]
+ 
+  * Cluster resource timeouts are not being honored. Timeouts matter
+ because a resource action must not time out earlier than expected
+ (starting a resource can sometimes take longer than the default
+ timeout because of configuration files, caches to be loaded, etc).
+ 
+ [Test Case]
+ 
+  * Create a pacemaker cluster with Ubuntu focal and configure a
+ primitive with:
+ 
+ primitive haproxy systemd:haproxy \
+         op monitor interval=2s \
+         op start interval=0s timeout=500s \
+         op stop interval=0s timeout=500s \
+         meta migration-threshold=2
+ 
+ or even
+ 
+ primitive haproxy systemd:haproxy \
+         op monitor interval=2s \
+         op start interval=0s timeout=500 \
+         op stop interval=0s timeout=500 \
+         meta migration-threshold=2
+ 
+ and observe timeouts are not being respected.
+ 
+ [Regression Potential]
+ 
+  * The number of patches is not small, but they are ALL related to the
+ same thing: fixing timeouts that do not work and reorganizing timing
+ for resources.
+ 
+  * TBD (more info to come)
+ 
+ [Other Info]
+  
+  * Original Description (from the reporter):
+ 
  While working on pacemaker, I discovered an issue with timeouts
  
  haproxy_stop_0 on primary 'OCF_TIMEOUT' (198): call=583, status='Timed
  Out', exitreason='', last-rc-change='1970-01-04 17:21:18 -05:00',
  queued=44ms,      exec=176272ms
  
  this led me down the path of finding that setting a timeout unit value
  was not doing anything
  
  primitive haproxy systemd:haproxy \
-         op monitor interval=2s \
-         op start interval=0s timeout=500s \
-         op stop interval=0s timeout=500s \
-         meta migration-threshold=2
+         op monitor interval=2s \
+         op start interval=0s timeout=500s \
+         op stop interval=0s timeout=500s \
+         meta migration-threshold=2
  
  primitive haproxy systemd:haproxy \
-         op monitor interval=2s \
-         op start interval=0s timeout=500 \
-         op stop interval=0s timeout=500 \
-         meta migration-threshold=2
+         op monitor interval=2s \
+         op start interval=0s timeout=500 \
+         op stop interval=0s timeout=500 \
+         meta migration-threshold=2
  
- the two above configs result in the same behaviour, pacemaker/crm seems to be ignoring the "s"
+ the two above configs result in the same behavior, pacemaker/crm seems
+ to be ignoring the "s"
+ 
  I filed a bug with pacemaker itself
  https://bugs.clusterlabs.org/show_bug.cgi?id=5429
  
  but this led to the following response, copied from the ticket:
  
  <<Looking back on your irc chat, I see you have a version of Pacemaker
  with a known bug:
  
  <<haproxy_stop_0 on primary 'OCF_TIMEOUT' (198): call=583, status='Timed
  Out', exitreason='', last-rc-change='1970-01-04 17:21:18 -05:00',
  queued=44ms,      exec=176272ms
  
  <<The incorrect date is a result of bugs that occur in systemd resources
  when Pacemaker 2.0.3 is built with the -UPCMK_TIME_EMERGENCY_CGT C
  flag (which is not the default). I was only aware of that being the
  case in one Fedora release. If those are stock Ubuntu packages, please
  file an Ubuntu bug to make sure they are aware of it.
  
  <<The underlying bugs are fixed as of the Pacemaker 2.0.4 release. If
  anyone wants to backport specific commits instead, the github pull
  requests #1992 and #1997 should take care of it.
  
  It appears that the root cause of my issue with setting timeout values
  with units ("600s") is a bug in the build process of ubuntu pacemaker
  
  1) lsb_release -d
     Description:    Ubuntu 20.04 LTS
  2) ii  pacemaker  2.0.3-3ubuntu3  amd64  cluster resource manager
  3) setting "100s" in the timeout of a resource should result in a 100 second
     timeout, not a 100 millisecond timeout
  4) the setting's unit value "s" is being ignored, forcing me to set the
     timeout to 10000 to get a 10 second timeout
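
To make the expected behavior in points 3 and 4 concrete, here is a minimal Python sketch of how a timeout string could be normalized to milliseconds. It is not Pacemaker's code: the function name timeout_to_ms, the set of accepted unit suffixes, and the choice of seconds as the default for a bare number are all assumptions made for illustration.

```python
import re

# Multipliers to milliseconds. The accepted suffixes are an assumption
# for this sketch, not a list taken from Pacemaker's source.
_UNIT_MS = {
    "ms": 1, "msec": 1,
    "s": 1000, "sec": 1000,
    "m": 60_000, "min": 60_000,
    "h": 3_600_000, "hr": 3_600_000,
}


def timeout_to_ms(value: str, default_unit: str = "s") -> int:
    """Convert a timeout such as "500s" or "500" to milliseconds.

    A bare number is read using default_unit (assumed here to be seconds,
    which is what the reporter expects for timeout=500).
    """
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]*)\s*", value.lower())
    if match is None:
        raise ValueError(f"cannot parse timeout: {value!r}")
    number, unit = match.groups()
    unit = unit or default_unit
    if unit not in _UNIT_MS:
        raise ValueError(f"unknown time unit: {unit!r}")
    return int(number) * _UNIT_MS[unit]


if __name__ == "__main__":
    for text in ("500s", "500", "100s", "10000ms"):
        print(f"{text!r:>10} -> {timeout_to_ms(text)} ms")
```

Under these assumptions "500s" and "500" both mean 500 seconds, and "100s" means 100 seconds. The behavior described in the report is what one would see if the suffix were dropped and the bare number were then read as milliseconds.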

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1881762

Title:
  resource timeout not respecting units

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1881762/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
