On 20.03.2024 09:59, Jan Beulich wrote:
> On 19.03.2024 21:35, Andrew Cooper wrote:
>> On 25/01/2024 2:12 pm, Jan Beulich wrote:
>>> "watchdog_timeout=0" is documented to disable the watchdog. Make sure
>>> this also is true when there's a subsequent "watchdog" command line
>>> option (and no further "watchdog_timeout=" one).
>>
>> We also document that latest takes precedence, at which point "watchdog"
>> would re-activate.
> 
> True,

Actually - no. Latest takes precedence doesn't matter here. "watchdog"
following "watchdog_timeout=0" is simply asking to enable the watchdog
with a timeout of 0, meaning infinity in practice. Which still is as
good as "watchdog=off".

> so perhaps ...
> 
>>> While there also switch watchdog_setup() to returning void, bringing it
>>> in line with the !CONFIG_WATCHDOG case. Further amend command line
>>> documentation to also mention the implicit effect of specifying a non-
>>> zero timeout.
>>>
>>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>>> ---
>>> Alternatively "watchdog" following "watchdog_timeout=0" could be taken
>>> to mean to use the default timeout again.
> 
> ... this alternative wants following.
> 
>> I realise that watchdog_timeout is my fault, but in fairness it was an
>> early change of mine in Xen and didn't exactly get the kind of review it
>> would get these days.  It also wasn't used by XenServer in the end - we
>> just stayed at a default 5s.
>>
>> I'm very tempted to suggest deleting watchdog_timeout, and extending
>> watchdog= to have `force | <bool> | <int>s` so you could specify e.g.
>> `watchdog=10s`.

This being a set of alternatives also isn't quite right. "force" needs to
be possible to combine with a timeout value. Yet if we make it "List of",
which I was ...

>> The watchdog is off by default so I don't expect this will impact
>> people.  It is also more convenient for the end user, and means that we
>> don't have have the current split approach of two separate options
>> fighting for control over each other.
> 
> While I'd be happy to fold the two options, I don't think the watchdog
> being off by default is relevant here. People using just the
> watchdog_timeout= option with a non-zero value will already have the
> watchdog enabled. They'd need to pay attention to an eventual CHANGELOG
> entry and change their command line.
> 
> Furthermore consolidating the two options isn't going to remove any
> of the problems. What effect would e.g. "watchdog=off,10s" have? The
> principle of "latest takes precedence" assigns clear meaning to
> "watchdog=off watchdog=10s", but the above remains as ambiguous as
> e.g. "watchdog=force,0s". I'd be inclined to follow those to the
> letter, i.e. "watchdog=off,10s" sets the timeout to 10 but disables
> the watchdog while "watchdog=force,0s" simply results in a non-
> functioning watchdog (due to 0s effectively meaning 4 billion seconds
> and hence for all practical purposes "never").

... assuming anyway (despite you having it written differently), we'll
have said problems again. So perhaps

<bool> | List of [ force | <int>s ]

with a timeout of 0 disabling the watchdog and a non-zero one enabling it?

Jan

Reply via email to