Hello,

I am trying to find a solution to a new issue that appeared when I started installing Debian 11 Bullseye with FAI.

My disk config is as follows:

disk_config sda disklabel:gpt fstabkey:device bootable:1
primary /boot/efi   512M     vfat   rw
primary -   1G    -   -
primary -   10240   -   -
primary -       0-      -       -

disk_config sdb disklabel:gpt fstabkey:device bootable:1
primary -   512M     -   -
primary -   1G    -   -
primary -   10240   -   -
primary -       0-      -       -

disk_config raid fstabkey:uuid
raid1   swap    sda2,sdb2   swap    sw
raid1   /       sda3,sdb3   ext4    noatime,errors=remount-ro mdcreateopts="--metadata=0.90"
raid1   -       sda4,sdb4       -   -

disk_config lvm fstabkey:uuid
vg disk_tmp    md2
disk_tmp-var          /var                      20G     ext4 noatime,errors=remount-ro
disk_tmp-home         /home                     50G     ext4 noatime,errors=remount-ro

disk_config tmpfs
tmpfs   /tmp    RAM:10% defaults

This exact config worked fine for installing Debian 10 Buster. With Bullseye, however, if there is already LVM metadata at the beginning of the RAID partitions, pvcreate fails right after the arrays are initialized, complaining that it cannot get an exclusive lock on the target device:

Executing: yes | mdadm --create  /dev/md0 --level=raid1 --force --run --raid-devices=2 /dev/sdb2 /dev/sda2
Executing: echo frozen > /sys/block/md0/md/sync_action
Executing: mkswap  /dev/md0
Executing: yes | mdadm --create --metadata=0.90 /dev/md1 --level=raid1 --force --run --raid-devices=2 /dev/sdb3 /dev/sda3
Executing: echo frozen > /sys/block/md1/md/sync_action
Executing: mkfs.ext4  /dev/md1
Executing: yes | mdadm --create  /dev/md2 --level=raid1 --force --run --raid-devices=2 /dev/sdb4 /dev/sda4
Executing: echo frozen > /sys/block/md2/md/sync_action
Executing: pvcreate -ff -y  /dev/md2
pvcreate -ff -y  /dev/md2 had exit code 5
(STDERR)   Can't open /dev/md2 exclusively.  Mounted filesystem?
(STDERR)   Can't open /dev/md2 exclusively.  Mounted filesystem?
Command had non-zero exit code
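As a quick cross-check (these commands are not part of the FAI log above, just what one could run by hand on the install client to see what keeps the device busy):

# list device-mapper devices stacked on top of the fresh arrays
dmsetup ls --tree
# show which kernel devices hold /dev/md2 open
ls /sys/block/md2/holders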

Running lsblk after the error shows the probable cause:
NAME                           MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                              8:0    0  100G  0 disk
|-sda1                           8:1    0  512M  0 part
|-sda2                           8:2    0    1G  0 part
| `-md0                          9:0    0 1022M  0 raid1
|-sda3                           8:3    0   10G  0 part
| `-md1                          9:1    0   10G  0 raid1
`-sda4                           8:4    0 88.5G  0 part
  `-md2                          9:2    0 88.4G  0 raid1
    |-shinstall--test_vg0-var  254:0    0   20G  0 lvm
    `-shinstall--test_vg0-home 254:1    0   50G  0 lvm
sdb                              8:16   0  100G  0 disk
|-sdb1                           8:17   0  512M  0 part
|-sdb2                           8:18   0    1G  0 part
| `-md0                          9:0    0 1022M  0 raid1
|-sdb3                           8:19   0   10G  0 part
| `-md1                          9:1    0   10G  0 raid1
`-sdb4                           8:20   0 88.5G  0 part
  `-md2                          9:2    0 88.4G  0 raid1
    |-shinstall--test_vg0-var  254:0    0   20G  0 lvm
    `-shinstall--test_vg0-home 254:1    0   50G  0 lvm

There were already logical volumes defined on the disks before the installation, and since I use a single disk layout for all software RAID servers, the partitions end up in the same places and none of the pre-existing LVM metadata gets overwritten. As a result, the old volume group is activated automatically as soon as the array is initialized. Running "vgchange -an" to deactivate the pre-existing volume group and then re-running pvcreate fixes the issue, but I have no way to inject that vgchange call between the SWRaid array creation and the LVM physical volume creation.
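For reference, the manual fix boils down to the following; the VG name is an assumption taken from the lsblk output above (the device-mapper name shinstall--test_vg0-var corresponds to a VG called shinstall-test_vg0):

# deactivate the leftover volume group that got auto-activated on /dev/md2
vgchange -an shinstall-test_vg0
# then repeat the command setup-storage failed on
pvcreate -ff -y /dev/md2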

I tried adding a partition hook that runs wipefs and dd on the disks to wipe all previous signatures:

# deactivate any volume groups that were auto-activated from leftover metadata
vgchange -an
# $disklist is set by FAI to the space-separated list of target disks
for disk in $disklist
do
    if [ ! -e "/dev/$disk" ]
    then
        continue
    fi
    echo "Wiping /dev/$disk"
    # erase all filesystem/RAID/LVM signatures wipefs finds on the whole-disk device
    wipefs -a -f "/dev/$disk"
    # and zero the first 20 MiB for good measure
    dd if=/dev/zero of="/dev/$disk" bs=1024 count=20480
done
exit 0

However, wipefs run on the whole-disk device doesn't look any further than the initial superblocks, while the offending LVM signature sits at the beginning of a partition further into the disk. So aside from calculating the partition offsets on the fly and dd'ing over those sections in a pre-partition hook, I don't see a clean way to solve this.
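One variation on that idea I am considering (untested, and it assumes the old partition table is still readable when the hook runs, so the old partition devices exist) would avoid the offset math by wiping each old partition device before wiping the disk itself:

#! /bin/bash
# pre-partition hook sketch: wipe signatures from the old partitions first,
# then from the whole disk, so no leftover LVM metadata survives repartitioning
vgchange -an
for disk in $disklist
do
    [ -e "/dev/$disk" ] || continue
    # every partition of this disk that the kernel still knows about
    for part in /sys/block/"$disk"/"$disk"*
    do
        [ -d "$part" ] || continue
        partdev=$(basename "$part")
        echo "Wiping /dev/$partdev"
        wipefs -a -f "/dev/$partdev"
    done
    # finally the whole-disk device (partition table, RAID superblocks, ...)
    echo "Wiping /dev/$disk"
    wipefs -a -f "/dev/$disk"
    dd if=/dev/zero of="/dev/$disk" bs=1024 count=20480
done
exit 0

If the old RAID arrays get auto-assembled before the hook runs, the partitions may be busy and need an mdadm --stop first; I have not checked that yet.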

Any other ideas on how to solve this? Has anyone else encountered this issue?
