Package: multipath-tools
Version: 0.5.0-6+deb8u2
Severity: important

* What led up to the situation?

Upgrade working wheezy system to jessie.
Apply patch to fix multipath segfault, see #751993.

The system boots off an internal physical disk.
That disk has one / partition and an LVM partition.
/usr is one of the LVs on that disk.
There are two other internal disks, LVM is not used on these.

The multipath devices are connected via a qlogic ISP2432-based card,
through a FC switch to two Promise VTrak units.
LVM is not used on the multipath devices.

* What exactly did you do (or not do) that was effective (or ineffective)?

Cold-plug FC connection to external storage, boot system.

* What was the outcome of this action?

multipath -l shows no devices.
No related device maps in /dev/mapper.

The FC and SCSI layers all worked fine, I see lots of /dev/sdX devices.
multipath -l -v3 shows the /dev/sdX devices as blacklisted:

...
sdd: blacklisted, udev property missing
sde: blacklisted, udev property missing
sdp: blacklisted, udev property missing
...etc

* What outcome did you expect instead?

Usable multipaths to the configured devices.
/dev/mapper populated, including kpartx partitions.

Related notes:

On some reboots the system log shows multipath timing out. Below is 'sdd'.
The timeout occurs 33 seconds after the disk was attached.
I was unable to determine the cause of this or reproduce consistently.

Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] 25769805824 512-byte logical blocks: (13.1 TB/12.0 TiB)
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Write Protect is off
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Mode Sense: 97 00 10 08
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
Nov 11 11:11:53 kernel: sdd: sdd1
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Attached SCSI disk
...
Nov 11 11:12:25 systemd-udevd[346]: timeout '/sbin/multipath -v0 /dev/sdd'
Nov 11 11:12:26 systemd-udevd[346]: timeout: killing '/sbin/multipath -v0 /dev/sdd' [459]
Nov 11 11:12:26 systemd-udevd[346]: '/sbin/multipath -v0 /dev/sdd' [459] terminated by signal 9 (Killed)

On some reboots, there was a bad interaction between LVM and multipathd
and/or udev. On the console, systemd showed it was waiting for tasks to
complete for both of these. device-mapper would try to handle the
multipath devices before the LVM ones, which sometimes caused the system
to fail to boot and drop into emergency mode.
I was unable to determine the cause of this or reproduce it consistently.

I don't know where the multipath-tools-boot line in the log below comes
from; that package is not even installed. You can however see multipath
and LVM running contemporaneously:

# journalctl |egrep -i -e '(multipath|lvm|-udev)'
Nov 11 16:52:57 systemd[1]: Starting LVM2 metadata daemon socket.
Nov 11 16:52:57 systemd[1]: Listening on LVM2 metadata daemon socket.
Nov 11 16:52:57 systemd-udevd[317]: starting version 215
Nov 11 16:52:58 systemd[1]: Starting LSB: early multipath boot script...
Nov 11 16:52:58 kernel: device-mapper: multipath: version 1.7.0 loaded
Nov 11 16:52:59 multipath-tools-boot[633]: Discovering and coalescing multipaths...done.
Nov 11 16:52:59 systemd[1]: Started LSB: early multipath boot script.
Nov 11 16:52:59 systemd[1]: Starting system-lvm2\x2dpvscan.slice.
Nov 11 16:52:59 systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
Nov 11 16:52:59 systemd[1]: Starting LVM2 PV scan on device 8:2...
Nov 11 16:52:59 systemd[1]: Starting Activation of LVM2 logical volumes...
Nov 11 16:52:59 systemd[1]: Started LVM2 PV scan on device 8:2.
Nov 11 16:52:59 lvm[676]: 10 logical volume(s) in volume group "testbox" now active
Nov 11 16:53:00 systemd[1]: Started Activation of LVM2 logical volumes.
Nov 11 16:53:00 systemd[1]: Starting Activation of LVM2 logical volumes...
Nov 11 16:53:01 lvm[854]: 10 logical volume(s) in volume group "testbox" now active
Nov 11 16:53:01 systemd[1]: Started Activation of LVM2 logical volumes.
Nov 11 16:53:01 systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
Nov 11 16:53:01 lvm[928]: 10 logical volume(s) in volume group "testbox" monitored
Nov 11 16:53:01 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Nov 11 16:53:09 systemd[1]: Starting LSB: multipath daemon...
Nov 11 16:53:10 multipath-tools[1190]: Starting multipath daemon: multipathd.
Nov 11 16:53:10 systemd[1]: Started LSB: multipath daemon.
Nov 11 16:53:10 multipathd[1288]: path checkers start up

Further work:

First I applied the patch discussed in #799781 (shared lock with udev).
This didn't fix things but may have helped.

Then, after reviewing #782487, I built sg3-utils v1.42 and installed it,
including sg3-utils-udev. This got the system working.
Hot-plugging the fibre doesn't work properly, but that will have to wait
for another bug.

I reverted the shared lock patch to test whether it is essential.
It appears not: I was able to boot the system fine as long as I had the
sg3-utils packages installed. Nonetheless it seems worth including, as I
did notice that multipath-tools and LVM were trying to do things at the
same time (systemd was waiting for both).

Notice also the dm ordering:

# dmsetup ls |sort -t: -k2,2 -n |column -t
testbox-swap_1       (254:0)
testbox-usr          (254:1)
vt04-ld4             (254:2)
vt05-ld3-atoa        (254:3)
vt05-ld4-atoa        (254:4)
vt05-ld5-atoa        (254:5)
vt04-ld4-part1       (254:6)
vt05-ld4-atoa-part1  (254:7)
vt05-ld5-atoa-part1  (254:8)
testbox-var          (254:9)
testbox-var+log      (254:10)
testbox-tmp          (254:11)
testbox-opt          (254:12)
testbox-local        (254:13)
testbox-srv          (254:14)
testbox-data         (254:15)
testbox-srv+jenkins  (254:16)

Requests:

Can we please have sg3-utils v1.42 added to a stable point release?
Also, multipath-tools needs to depend on sg3-utils-udev.
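
To check whether another system is hitting the same blacklisting, the
affected paths can be pulled out of the multipath -l -v3 output. This is
only a minimal sketch in Python. It assumes the message wording is exactly
as quoted earlier in this report ("sdX: blacklisted, udev property
missing"); other multipath-tools versions may word it differently. The
function name property_blacklisted and the sample text are mine, for
illustration, and are not part of multipath-tools.

```python
import re

# Wording taken from the multipath -l -v3 output quoted in this report.
# Assumption: any prefix before the device name (e.g. a timestamp) is
# separated from it by whitespace.
LINE = re.compile(r"(\S+): blacklisted, udev property missing")


def property_blacklisted(output):
    """Return device names blacklisted for a missing udev property.

    Names are returned in order of first appearance, without duplicates.
    """
    devices = []
    for line in output.splitlines():
        match = LINE.search(line)
        if match and match.group(1) not in devices:
            devices.append(match.group(1))
    return devices


# Sample input copied from the report, not captured live.
sample = """\
sdd: blacklisted, udev property missing
sde: blacklisted, udev property missing
sdp: blacklisted, udev property missing
"""

print(property_blacklisted(sample))
```

On a live system the same function could be fed the real command output
instead of the sample. Each listed device can then be inspected with
udevadm info --query=property --name=/dev/sdX to see which udev
properties it actually carries.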

It seems a shame to not include the shared lock patch as it avoids a
known deadlock and the system still works fine with it included.

-- Package-specific info:
Contents of /etc/multipath.conf:
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z][[0-9]*]"
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
    device {
        vendor MegaRAID
    }
    device {
        vendor APPLE
    }
    device {
        vendor ATA
    }
    device {
        vendor DELL
    }
    device {
        vendor Dell
    }
}
devices {
    device {
        vendor "Promise"
        product "VTrak"
        path_grouping_policy multibus
        getuid_callout "/lib/udev/scsi_id --whitelisted --replace-whitespace --device /dev/%n"
        path_checker readsector0
        path_selector "round-robin 0"
        hardware_handler "0"
        failback immediate
        rr_weight uniform
        rr_min_io 100
        no_path_retry 20
        features "1 queue_if_no_path"
        product_blacklist "VTrak V-LUN"
    }
}
multipaths {
    multipath {
        wwid 22258000155e916fb
        alias vt05-ld3-atoa
    }
    multipath {
        wwid 22268000155a61f7d
        alias vt05-ld4-atoa
    }
    multipath {
        wwid 222e8000155286d15
        alias vt05-ld5-atoa
    }
    multipath {
        wwid 22290000155c7e34a
        alias vt04-ld4
    }
}

-- System Information:
Debian Release: 8.6
  APT prefers stable
  APT policy: (990, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/12 CPU cores)
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages multipath-tools depends on:
ii  initscripts         2.88dsf-59
ii  kpartx              0.5.0-6+deb8u2
ii  libaio1             0.3.110-1
ii  libc6               2.19-18+deb8u6
ii  libdevmapper1.02.1  2:1.02.90-2.2+deb8u1
ii  libgcc1             1:4.9.2-10
ii  libreadline6        6.3-8+b3
ii  libudev1            215-17+deb8u5
ii  lsb-base            4.1+Debian13+nmu1
ii  udev                215-17+deb8u5

multipath-tools recommends no packages.

Versions of packages multipath-tools suggests:
pn  multipath-tools-boot  <none>

-- no debconf information