On 05.11.21 at 15:47, Fabian Ebner wrote:
On 05.11.21 at 14:12, Thomas Lamprecht wrote:
On 05.11.21 14:06, Fabian Ebner wrote:
Since commit 277d33454f77ec1d1e0bc04e37621e4dd2424b67 in pve-qemu,
smm=off is no longer the default, but with SeaBIOS and serial display,
this can lead to a boot loop.

Reported in the community forum [0] and reproduced with a Debian 10
VM.

[0]: https://forum.proxmox.com/threads/pve-7-0-all-vms-with-cloud-init-seabios-fail-during-boot-process-bootloop-disk-not-found.97310/post-427129

Signed-off-by: Fabian Ebner <f.eb...@proxmox.com>
---
  PVE/QemuServer.pm                    | 12 ++++++++++
  test/cfg2cmd/seabios_serial.conf     | 16 ++++++++++++++
  test/cfg2cmd/seabios_serial.conf.cmd | 33 ++++++++++++++++++++++++++++
  3 files changed, 61 insertions(+)
  create mode 100644 test/cfg2cmd/seabios_serial.conf
  create mode 100644 test/cfg2cmd/seabios_serial.conf.cmd
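
For context, a minimal SeaBIOS + serial-display VM config that exercises the new code path might look like the following sketch (standard qm config keys; not necessarily the actual contents of the new test file):

    bios: seabios
    serial0: socket
    vga: serial0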

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 763c412..9b76512 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3403,6 +3403,16 @@ my sub get_cpuunits {
      my ($conf) = @_;
      return $conf->{cpuunits} // (PVE::CGroup::cgroup_mode() == 2 ? 100 : 1024);
  }
+
+# Since commit 277d33454f77ec1d1e0bc04e37621e4dd2424b67 in pve-qemu, smm is not off by default
+# anymore. But smm=off seems to be required when using SeaBIOS and serial display.
+my sub should_disable_smm {
+    my ($conf, $vga) = @_;
+
+    return (!defined($conf->{bios}) || $conf->{bios} eq 'seabios') &&
+        $vga->{type} && $vga->{type} =~ m/^serial\d+$/;
+}
+
  sub config_to_command {
      my ($storecfg, $vmid, $conf, $defaults, $forcemachine, $forcecpu,
          $pbs_backing) = @_;
@@ -4002,6 +4012,8 @@ sub config_to_command {
        push @$machineFlags, 'accel=tcg';
    }
+    push @$machineFlags, 'smm=off' if should_disable_smm($conf, $vga);
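
For readability outside the diff, here is a self-contained sketch of the helper's logic with made-up example configs (using a plain sub instead of the lexical 'my sub', so it runs stand-alone):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Same condition as in the patch: disable SMM when the BIOS is SeaBIOS
    # (the default when 'bios' is unset) and the display type is serial.
    sub should_disable_smm {
        my ($conf, $vga) = @_;

        return (!defined($conf->{bios}) || $conf->{bios} eq 'seabios')
            && $vga->{type} && $vga->{type} =~ m/^serial\d+$/;
    }

    # Made-up example configurations, only for illustration:
    for my $case (
        [ 'default BIOS + serial display', {}, { type => 'serial0' } ],
        [ 'OVMF + serial display', { bios => 'ovmf' }, { type => 'serial0' } ],
        [ 'SeaBIOS + std display', { bios => 'seabios' }, { type => 'std' } ],
    ) {
        my ($desc, $conf, $vga) = @$case;
        print "$desc -> smm=off added: ", should_disable_smm($conf, $vga) ? 'yes' : 'no', "\n";
    }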

Doesn't that break live migration? Or do we know that it could never work with smm=on under SeaBIOS + display=serial, so that there can be no running VM that could be live-migrated?

No, we can't say that. I found that the same configuration does work with CentOS 7.



Stefan tested migration in both directions when the pve-qemu patch was dropped [0], though he sounded more confident about the smm=off -> smm=on direction. And now the scenario Wolfgang mentioned would actually be relevant, because we can get into a situation of smm=off -> smm=on -> smm=off by migrating from older -> old -> new. I'll test around some more and see if I can break it.

[0]: https://lists.proxmox.com/pipermail/pve-devel/2021-August/049778.html


With the following nodes:

N1: pve-qemu 6.0.0-3 -> smm=off
N2: pve-qemu 6.1.0-1 -> smm=on
N3: patched qemu-server to always turn smm off -> smm=off

I tested:

1. Migration N1 -> N2
2. soft reboot
3. migration N2 -> N3
4. soft reboot

For Windows 10 and CentOS 7, everything seemed to work. I didn't test Windows with serial output, as I don't know how to set that up.
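
For reference, this is roughly how the effective setting on each node can be double-checked (a quick sketch; VM ID 100 is a placeholder, and it assumes the flag ends up in the -machine argument of the generated command line, as in the patch):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $vmid = 100; # placeholder VM ID
    my $cmd = qx(qm showcmd $vmid); # command line qemu-server would use

    # qemu-server only sets smm explicitly to turn it off; without the
    # flag, QEMU's own default (smm=on for these machine types) applies.
    if ($cmd =~ /-machine\s+\S*?smm=(on|off)/) {
        print "explicit smm=$1\n";
    } else {
        print "no explicit smm flag, QEMU default applies\n";
    }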

Still, the same might not be true for more complex (virtual) hardware setups.

For the problematic Debian 10 VM, step 2 exhibits the problem, i.e. it leads to a boot loop. When I skipped step 2, the sequence seemed to work as well.

If there is a problem with live migration from smm=on to smm=off, it's at least not obvious, and it would only affect VMs running pve-qemu >= 6.0.0-4 (with SeaBIOS and serial display). On the other hand, the patch does fix a user-reported regression introduced in pve-qemu 6.0.0-4 (although it's also not clear how widespread it is).



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



