Launchpad has imported 9 comments from the remote bug at
https://bugzilla.redhat.com/show_bug.cgi?id=1701234.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2019-04-18T12:56:02+00:00 rmetrich wrote:

Description of problem:

The blk-availability.service unit is activated automatically when multipathd is
enabled, even if multipathd ends up not being used.
This causes the blk-availability service to unmount file systems too early,
breaking unit ordering and leading to shutdown issues for custom services
that require certain mount points.
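
The relevant ordering can be inspected with, for instance (output varies
per system):

  # systemctl cat blk-availability.service
  # systemctl show blk-availability.service -p Before -p After -p WantedBy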


Version-Release number of selected component (if applicable):

device-mapper-1.02.149-10.el7_6.3.x86_64


How reproducible:

Always


Steps to Reproduce:

1. Enable multipathd even though there is no multipath device

  # yum -y install device-mapper-multipath
  # systemctl enable multipathd --now
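
  As a sanity check, "multipath -ll" should print nothing here, confirming
  that no multipath device actually exists:

  # multipath -ll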

2. Create a custom mount point "/data"

  # lvcreate -n data -L 1G rhel
  # mkfs.xfs /dev/rhel/data
  # mkdir /data
  # echo "/dev/mapper/rhel-data /data xfs defaults 0 0" >> /etc/fstab
  # mount /data

3. Create a custom service requiring mount point "/data"

  # cat > /etc/systemd/system/my.service << EOF
[Unit]
RequiresMountsFor=/data

[Service]
ExecStart=/bin/bash -c 'echo "STARTING"; mountpoint /data; true'
ExecStop=/bin/bash -c 'echo "STOPPING IN 5 SECONDS"; sleep 5; mountpoint /data; true'
Type=oneshot
RemainAfterExit=true

[Install]
WantedBy=default.target
EOF
  # systemctl daemon-reload
  # systemctl enable my.service --now
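
  To verify that RequiresMountsFor= translated into the expected
  dependencies (it should pull in Requires=data.mount and After=data.mount):

  # systemctl show my.service -p Requires -p After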

4. Set up persistent journal and reboot

  # mkdir -p /var/log/journal
  # systemctl restart systemd-journald
  # reboot

5. Check the previous boot's shutdown

  # journalctl -b -1 -o short-precise -u my.service -u data.mount -u blk-availability.service

Actual results:

-- Logs begin at Thu 2019-04-18 12:48:12 CEST, end at Thu 2019-04-18 13:35:50 CEST. --
Apr 18 13:31:46.933571 vm-blkavail7 systemd[1]: Started Availability of block devices.
Apr 18 13:31:48.452326 vm-blkavail7 systemd[1]: Mounting /data...
Apr 18 13:31:48.509633 vm-blkavail7 systemd[1]: Mounted /data.
Apr 18 13:31:48.856228 vm-blkavail7 systemd[1]: Starting my.service...
Apr 18 13:31:48.894419 vm-blkavail7 bash[2856]: STARTING
Apr 18 13:31:48.930270 vm-blkavail7 bash[2856]: /data is a mountpoint
Apr 18 13:31:48.979457 vm-blkavail7 systemd[1]: Started my.service.
Apr 18 13:35:02.544999 vm-blkavail7 systemd[1]: Stopping my.service...
Apr 18 13:35:02.547811 vm-blkavail7 systemd[1]: Stopping Availability of block devices...
Apr 18 13:35:02.639325 vm-blkavail7 bash[3393]: STOPPING IN 5 SECONDS
Apr 18 13:35:02.760043 vm-blkavail7 blkdeactivate[3395]: Deactivating block devices:
Apr 18 13:35:02.827170 vm-blkavail7 blkdeactivate[3395]: [SKIP]: unmount of rhel-swap (dm-1) mounted on [SWAP]
Apr 18 13:35:02.903924 vm-blkavail7 systemd[1]: Unmounted /data.
Apr 18 13:35:02.988073 vm-blkavail7 blkdeactivate[3395]: [UMOUNT]: unmounting rhel-data (dm-2) mounted on /data... done
Apr 18 13:35:02.988253 vm-blkavail7 blkdeactivate[3395]: [SKIP]: unmount of rhel-root (dm-0) mounted on /
Apr 18 13:35:03.083448 vm-blkavail7 systemd[1]: Stopped Availability of block devices.
Apr 18 13:35:07.693154 vm-blkavail7 bash[3393]: /data is not a mountpoint
Apr 18 13:35:07.696330 vm-blkavail7 systemd[1]: Stopped my.service.

--> We can see the following:
- blkdeactivate runs, unmounting /data, even though my.service is still
  running (hence the unexpected message "/data is not a mountpoint")


Expected results:

- my.service gets stopped
- then "data.mount" gets stopped
- finally blkdeactivate runs


Additional info:

I understand there is some chicken-and-egg problem here, but it's just
not possible to blindly unmount file systems and ignore expected unit
ordering.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/0

------------------------------------------------------------------------
On 2019-04-23T13:09:14+00:00 prajnoha wrote:

Normally, I'd add Before=local-fs-pre.target to blk-availability.service
so that on shutdown its ExecStop would execute after all local mount
points are unmounted.

The problem might be with all the dependencies like the iscsi, fcoe and
rbdmap services, where we need to make sure that these are executed
*after* blk-availability. So I need to find a proper target that we can
hook onto so that it also fits all the dependencies. It's possible we need
to create a completely new target so we can properly synchronize all the
services on shutdown. I'll see what I can do...
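
As a sketch, that ordering change would amount to a drop-in along these
lines (untested; the file name is arbitrary):

  # mkdir -p /etc/systemd/system/blk-availability.service.d
  # cat > /etc/systemd/system/blk-availability.service.d/ordering.conf << EOF
[Unit]
# Being ordered Before=local-fs-pre.target means ExecStop only runs once
# every unit ordered after local-fs-pre.target (the local mounts) has
# been stopped.
Before=local-fs-pre.target
EOF
  # systemctl daemon-reload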

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/1

------------------------------------------------------------------------
On 2019-04-23T13:17:39+00:00 rmetrich wrote:

Indeed, I wasn't able to find a proper target; none exists.
I believe blk-availability itself needs to be modified to only deactivate
non-local disks (hopefully there is a way to distinguish them).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/2

------------------------------------------------------------------------
On 2019-06-19T13:34:15+00:00 rmetrich wrote:

Hi Peter,

Could you explain why blk-availability is needed when using multipath or iscsi?
With systemd ordering dependencies in units, is that really needed?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/15

------------------------------------------------------------------------
On 2019-06-21T08:43:50+00:00 prajnoha wrote:

(In reply to Renaud Métrich from comment #4)
> Hi Peter,
> 
> Could you explain why blk-availability is needed when using multipath or
> iscsi?
> With systemd ordering dependencies in units, is that really needed?

It is still needed because otherwise there wouldn't be anything else to
properly deactivate the stack. Even though the blk-availability.service
with its blkdeactivate call is still not perfect, it's better than
nothing, and better than letting systemd shoot down the devices on its
own within its "last-resort" device deactivation loop that happens in the
shutdown initramfs (at that point, the iscsi/fcoe and all the other
remote devices are already disconnected anyway, so anything on top of
them can't be properly deactivated).

We've just received a related report on GitHub too
(https://github.com/lvmteam/lvm2/issues/18).

I'm revisiting this problem now. The correct solution requires more
patching - this part is very fragile at the moment (it's easy to break
other functionality).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/17

------------------------------------------------------------------------
On 2019-06-21T08:47:41+00:00 prajnoha wrote:

(In reply to Renaud Métrich from comment #3)
> I believe blk-availability itself needs to be modified to only deactivate
> non-local disks (hopefully there is a way to distinguish).

It's possible that we need to split blk-availability (and blkdeactivate)
in two because of this... There is a way to distinguish, I hope
(definitely for iscsi/fcoe), but there currently isn't a central
authority to decide this, so it must be done by hand, checking certain
properties in sysfs.
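
For instance, today there are only heuristics like these (illustrative,
not authoritative; device names are examples):

  # lsblk -o NAME,TYPE,TRAN     # TRAN shows the transport (sata, fc, iscsi, ...) where known
  # readlink -f /sys/block/sdb  # iSCSI disks resolve to a sysfs path containing "sessionN"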

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/18

------------------------------------------------------------------------
On 2019-06-21T08:51:28+00:00 rmetrich wrote:

I must be missing something. This service is used to deactivate "remote" block 
devices requiring the network, such as iscsi or fcoe.
Why aren't these services deactivating the block devices by themselves?
That way systemd won't kill everything abruptly.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/20

------------------------------------------------------------------------
On 2019-06-21T09:08:22+00:00 prajnoha wrote:

(In reply to Renaud Métrich from comment #7)
> I must be missing something. This service is used to deactivate "remote"
> block devices requiring the network, such as iscsi or fcoe.

Nope, ALL storage, remote as well as local, if possible. We need to look
at the complete stack (e.g. device-mapper devices, which are layered on
top of other layers, are set up locally).
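
For example, the full layering is visible with standard tools; even when
the bottom device is remote, everything stacked above it is local:

  # lsblk               # the whole stack, e.g. disk -> multipath -> LVM LV
  # dmsetup ls --tree   # just the device-mapper layers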

> Why aren't these services deactivating the block devices by
> themselves?

Well, honestly, because nobody has ever solved that :)

At the beginning it probably wasn't that necessary: if you just shut
your system down and left the devices as they were (unattached, not
deactivated), it wasn't such a problem. But now, with various caching
layers, thin pools... it's getting quite important to deactivate the
stack properly, to also properly flush any metadata or data.
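
That teardown is essentially a blkdeactivate call; if I recall correctly,
the shipped unit stops with something like the following (verify against
"systemctl cat blk-availability.service" on your system):

  # blkdeactivate -u -l wholevg -m disablequeueing

i.e. unmount first, deactivate whole VGs where possible, and disable
multipath queueing so the teardown can't hang.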

Of course, we still need to account for the situation where there's a
power outage and the machine is not backed by any other power source, so
your machine would be shut down immediately (for that, there are various
checking and fixing mechanisms). But it's certainly better to avoid this
situation, as you could still lose some data.

Systemd's loop in the shutdown initramfs is really the last-resort thing
to execute, but we can't rely on it (it's just a loop over the device
list with a limited iteration count; it doesn't look at the real nature
of each layer in the stack).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/21

------------------------------------------------------------------------
On 2019-06-21T09:39:30+00:00 rmetrich wrote:

OK, then we need a "blk-availability-local" service and a
"blk-availability-remote" service, and maybe associated targets, similar
to "local-fs.target" and "remote-fs.target".
This should probably be handled by the systemd package itself, typically
by analyzing device properties when a device shows up in udev.
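
A rough sketch of the remote half (names and the filter option are purely
illustrative; no --remote-only option exists today):

  # /etc/systemd/system/blk-availability-remote.service
[Unit]
Description=Availability of remote block devices
Before=remote-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
# hypothetical filter restricting blkdeactivate to network-backed devices
ExecStop=/usr/sbin/blkdeactivate -u --remote-only

[Install]
WantedBy=sysinit.target

The -local twin would mirror this with Before=local-fs-pre.target.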

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/22


** Changed in: lvm2 (Fedora)
       Status: Unknown => Confirmed

** Changed in: lvm2 (Fedora)
   Importance: Unknown => High

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1832859

Title:
  during shutdown libvirt-guests gets stopped after file system unmount

To manage notifications about this bug go to:
https://bugs.launchpad.net/lvm2/+bug/1832859/+subscriptions
