Copilot commented on code in PR #13090:
URL: https://github.com/apache/cloudstack/pull/13090#discussion_r3172805548


##########
agent/conf/agent.properties:
##########
@@ -310,6 +310,22 @@ iscsi.session.cleanup.enabled=false
 # This parameter specifies if the host must be rebooted when something goes 
wrong with the heartbeat.
 #reboot.host.and.alert.management.on.heartbeat.timeout=true
 
+# Action taken by kvmheartbeat.sh / kvmspheartbeat.sh when a storage heartbeat
+# write fails persistently. Supersedes the legacy binary
+# 'reboot.host.and.alert.management.on.heartbeat.timeout' when set to a 
non-default value.
+#
+# Allowed values:
+#   reboot          - immediate sysrq-trigger reboot (default; original 
behavior)
+#   graceful-reboot - 'systemctl reboot' instead of sysrq; allows VMs to stop 
cleanly
+#   restart-agent   - restart cloudstack-agent only; running VMs are preserved
+#   log-only        - log + alert; take no automatic action (admin must 
investigate)

Review Comment:
   The `log-only` description says “log + alert”, but the heartbeat scripts 
only write to syslog via `logger` and exit 0; there is no explicit 
management-server alerting performed by this action. Consider rewording to 
avoid implying an alert will be generated (or document exactly what “alert” 
means here).
   



##########
agent/conf/agent.properties:
##########
@@ -310,6 +310,22 @@ iscsi.session.cleanup.enabled=false
 # This parameter specifies if the host must be rebooted when something goes 
wrong with the heartbeat.
 #reboot.host.and.alert.management.on.heartbeat.timeout=true
 
+# Action taken by kvmheartbeat.sh / kvmspheartbeat.sh when a storage heartbeat
+# write fails persistently. Supersedes the legacy binary
+# 'reboot.host.and.alert.management.on.heartbeat.timeout' when set to a 
non-default value.
+#

Review Comment:
   The comment says this setting “supersedes” 
`reboot.host.and.alert.management.on.heartbeat.timeout`, but in `KVMHAMonitor` 
that boolean still gates whether the heartbeat script is invoked in fence mode 
at all. If the boolean is set to `false`, changing `kvm.heartbeat.fence.action` 
has no effect. Please reword to clarify the precedence/interaction (e.g., the 
boolean remains a global bypass, and this setting only changes the action when 
fencing is enabled).
   



##########
agent/src/main/java/com/cloud/agent/properties/AgentProperties.java:
##########
@@ -598,6 +598,25 @@ public class AgentProperties{
     public static final Property<Boolean> 
REBOOT_HOST_AND_ALERT_MANAGEMENT_ON_HEARTBEAT_TIMEOUT
         = new 
Property<>("reboot.host.and.alert.management.on.heartbeat.timeout", true);
 
+    /**
+     * Action taken by the KVM agent's storage heartbeat scripts 
(kvmheartbeat.sh / kvmspheartbeat.sh)
+     * when a heartbeat write fails persistently. Allowed values:
+     * <ul>
+     *   <li>{@code reboot} (default) — immediate sysrq-trigger reboot; 
original behavior</li>
+     *   <li>{@code graceful-reboot} — {@code systemctl reboot} instead of 
sysrq, lets VMs stop cleanly</li>
+     *   <li>{@code restart-agent} — restart cloudstack-agent only; running 
VMs preserved</li>
+     *   <li>{@code log-only} — log + alert, no automatic action</li>

Review Comment:
   The Javadoc lists `log-only` as “log + alert”, but the implementation in the 
heartbeat scripts only logs locally (syslog) and exits successfully; it doesn’t 
directly generate a management-server alert. Please adjust the wording (or 
clarify what alerting mechanism is expected) to avoid misleading operators.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to