Tested this by setting up an LVM volume group on a shared LUN (via iSCSI)
according to the Proxmox multipath docs [0]. On a 2-node cluster, replicated
the reproduction steps given in Friedrich's comment in the bug ticket and was
able to reproduce the original problems. With the patches applied, the
following scenario worked as I would expect it to:
- Created VM with ID 100 on shared LVM on node 1
- Restarted node 2
- Deleted VM 100
- Created VM 100 on node 2
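For reference, the checks below can be reproduced with commands along these
lines (a sketch only: the shared VG name is a placeholder, and the
`autoactivation` report field name is an assumption about the installed lvm2
version):

    # on node 2 after the reboot: no device mapper device for VM 100 should exist
    dmsetup ls | grep 100

    # show activation state and autoactivation flag of the guest LVs on the
    # shared VG
    lvs -o lv_name,lv_active,autoactivation <shared_vg>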
Checked that:
- Device mapper device is not created on node 2 during the reboot (output of
  `dmsetup ls | grep 100` shows no results)
- Creation of new VM 100 on node 2 is successful

Additionally checked the migration scenario on the same 2-node cluster:
- Created VM with ID 100 on shared LVM on node 1
- Rebooted node 2
- Created backup of VM 100
- Destroyed VM 100
- Restored VM 100 from backup on node 1
- Started VM 100
- Live-migrated VM 100 from node 1 to node 2

Checked that:
- Again, the device mapper device is not created on reboot of node 2
- After restoring VM 100 on node 1 and starting it, the inactive LV exists on
  node 2
- VM 100 is successfully live-migrated to node 2, and the previously inactive
  LV is set to active

Also ran the pve8to9 script:
- The script correctly detected that the LVM storage contained guest volumes
  with autoactivation enabled
- After running `pve8to9 updatelvm`, lvs reported autoactivation disabled for
  all volumes on the node

Consider this:

Tested-by: Michael Köppl <m.koe...@proxmox.com>

Also had a look at the changes to the best of my abilities, taking the
previous discussions from v1 and v2 into account. Apart from one suggestion I
added to the pve-manager 3/3 patch, the changes look good to me. So please
also consider this:

Reviewed-by: Michael Köppl <m.koe...@proxmox.com>

[0] https://pve.proxmox.com/wiki/Multipath

On 4/29/25 13:36, Friedrich Weber wrote:
> # Summary
>
> With default settings, LVM autoactivates LVs when it sees a new VG, e.g.
> after boot or iSCSI login. In a cluster with guest disks on a shared LVM VG
> (e.g. on top of iSCSI/Fibre Channel (FC)/direct-attached SAS), this can
> indirectly cause guest creation or migration to fail. See bug #4997 [1] and
> patch #2 for details.
>
> The primary goal of this series is to avoid autoactivating thick LVs that
> hold guest disks in order to fix #4997. For this, it patches the LVM storage
> plugin to create new LVs with autoactivation disabled, and implements a
> pve8to9 check and subcommand to disable autoactivation on existing LVs (see
> below for details).
>
> The series does the same for LVM-thin storages. While LVM-thin storages are
> inherently local and cannot run into #4997, it can still make sense to avoid
> unnecessarily activating thin LVs at boot.
>
> This series should only be applied for PVE 9, see below.
>
> Marked as RFC to get feedback on the general approach and some details, see
> patches #3 and #6. In any case, this series shouldn't be merged as-is, as it
> adds an incomplete stub implementation of pve8to9 (see below).
>
> # Mixed 7/8 cluster
>
> Unfortunately, we need to consider the mixed-version cluster between PVE 7
> and PVE 8 because PVE 7/Bullseye's LVM does not know `--setautoactivation`.
> A user upgrading from PVE 7 will temporarily have a mixed 7/8 cluster. Once
> this series is applied, the PVE 8 nodes will create new LVs with
> `--setautoactivation n`, which the PVE 7 nodes do not know. In my tests, the
> PVE 7 nodes can read/interact with such LVs just fine, *but*: as soon as a
> PVE 7 node creates a new (unrelated) LV, the `--setautoactivation n` flag is
> reset to the default `y` on *all* LVs of the VG. I presume this is because
> creating a new LV rewrites metadata, and the PVE 7 LVM doesn't write out the
> `--setautoactivation n` flag. I imagine (have not tested) this will cause
> problems on a mixed cluster.
>
> Hence, as also discussed in v2, we should only apply this series for PVE 9,
> as we can be sure all nodes are running at least PVE 8 then.
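At the LVM level, the approach described above corresponds roughly to the
following commands (a sketch only, not the plugin's actual invocation; VG/LV
names and the size are placeholders):

    # create a new guest LV with autoactivation disabled, as patch #2 has the
    # LVM plugin do for new volumes
    lvcreate --setautoactivation n -n vm-100-disk-0 -L 32G <shared_vg>

    # clear the flag on a volume created before the upgrade, which is the
    # effect the pve8to9 subcommand aims for (whether it uses lvchange
    # exactly like this is not confirmed here)
    lvchange --setautoactivation n <shared_vg>/vm-100-disk-0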
>
> # pve8to9 script
>
> As discussed in v2, this series implements
>
> (a) a pve8to9 check to detect thick and thin LVs with autoactivation enabled
> (b) a script to disable autoactivation on LVs when needed, intended to be
>     run manually by the user during the 8->9 upgrade
>
> pve8to9 doesn't exist yet, so patch #4 adds a stub implementation to have a
> basis for (a) and (b). We naturally don't have to go with this
> implementation, I'm happy to rebase once pve8to9 exists.
>
> Patch #5 moves the existing checks from `pve8to9` to `pve8to9 checklist`, to
> be able to implement (b) as a new subcommand. I realize this is a huge
> user-facing change, and we don't have to go with this approach. It is also
> incomplete, as patch #5 doesn't update the manpage. I included this to have
> a basis for the next patch.
>
> Patch #6 implements a pve8to9 subcommand for (b), but this can be moved to a
> dedicated script of course. Documentation for the new subcommand is missing.
>
> # Bonus fix for FC/SAS multipath+LVM issue
>
> As it turns out, this series seems to additionally fix an issue on hosts
> with LVM on FC/SAS-attached LUNs *with multipath* where LVM would report
> "Device mismatch detected" warnings because the LVs are activated too early
> in the boot process, before multipath is available. Our current suggested
> workaround is to install multipath-tools-boot [2]. With this series applied
> and when users have upgraded to 9, this shouldn't be necessary anymore, as
> LVs are not auto-activated after boot.
>
> # Interaction with zfs-initramfs
>
> zfs-initramfs used to ship an initramfs-tools script that unconditionally
> activates *all* VGs that are visible at boot time, ignoring the
> autoactivation flag. A fix was already applied in v2 [3].
>
> # Patch summary
>
> - Patch #1 is preparation
> - Patch #2 makes the LVM plugin create new LVs with `--setautoactivation n`
> - Patch #3 makes the LVM-thin plugin disable autoactivation for new LVs
> - Patch #4 creates a stub pve8to9 script (see pve8to9 section above)
> - Patch #5 moves pve8to9 checks to a subcommand (see pve8to9 section above)
> - Patch #6 adds a pve8to9 subcommand to disable autoactivation
>   (see pve8to9 section above)
>
> # Changes since v2
>
> - drop zfsonlinux patch that was since applied
> - add patches for LVM-thin
> - add pve8to9 patches
>
> v2: https://lore.proxmox.com/pve-devel/20250307095245.65698-1-f.we...@proxmox.com/
> v1: https://lore.proxmox.com/pve-devel/20240111150332.733635-1-f.we...@proxmox.com/
>
> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=4997
> [2] https://pve.proxmox.com/mediawiki/index.php?title=Multipath&oldid=12039#%22Device_mismatch_detected%22_warnings
> [3] https://lore.proxmox.com/pve-devel/ad4c806c-234a-4949-885d-8bb369860...@proxmox.com/
>
> pve-storage:
>
> Friedrich Weber (3):
>   lvm: create: use multiple lines for lvcreate command line
>   fix #4997: lvm: create: disable autoactivation for new logical volumes
>   lvmthin: disable autoactivation for new logical volumes
>
>  src/PVE/Storage/LVMPlugin.pm     | 10 +++++++++-
>  src/PVE/Storage/LvmThinPlugin.pm | 18 +++++++++++++++++-
>  2 files changed, 26 insertions(+), 2 deletions(-)
>
>
> pve-manager stable-8:
>
> Friedrich Weber (3):
>   cli: create pve8to9 script as a copy of pve7to8
>   pve8to9: move checklist to dedicated subcommand
>   pve8to9: detect and (if requested) disable LVM autoactivation
>
>  PVE/CLI/Makefile   |    1 +
>  PVE/CLI/pve8to9.pm | 1695 ++++++++++++++++++++++++++++++++++++++++++++
>  bin/Makefile       |   12 +-
>  bin/pve8to9        |    8 +
>  4 files changed, 1715 insertions(+), 1 deletion(-)
>  create mode 100644 PVE/CLI/pve8to9.pm
>  create mode 100755 bin/pve8to9
>
>
> Summary over all repositories:
>   6 files changed, 1741 insertions(+), 3 deletions(-)

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel