Hi Kaihengfeng,
Thanks for your patch suggestion! Semantically I'm not sure it is the right
thing. To clarify, your theory is that the code used to check !resuming, and
that the !cdev check in front of it was maybe just there to avoid a
dereference error. You now assume that instead of !cdev it should check
whether there is a cdev.
I'm unsure: if !cdev was indeed just there to protect the dereference, then
maybe no check at all would be better. That would then read "if the event is
IO_SCH_ORPH_UNREG or IO_SCH_UNREG then do css_sch_device_unregister".

That I'm not immediately convinced doesn't mean much, though. It is easy
to test and surely worth a try, and the result will be useful to know in
any case, so I ran v5.11 (bad) plus your patch. It is working fine, that
much I can tell you.

But if my thought above was right (the check was only there to avoid the
potential dereference error), then why check it at all? If the condition
cdev==NULL is possible, your patch would now skip fully removing the
subchannel in that case - we might not need the check at all.
And since I brought up the idea of dropping the cdev check entirely, that was
worth a try as well. So the third test of this morning is for:
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -1525,8 +1525,7 @@ static int io_subchannel_sch_event(struct subchannel *sch, int process)
        switch (action) {
        case IO_SCH_ORPH_UNREG:
        case IO_SCH_UNREG:
-               if (!cdev)
-                       css_sch_device_unregister(sch);
+               css_sch_device_unregister(sch);
                break;
        case IO_SCH_ORPH_ATTACH:
        case IO_SCH_UNREG_ATTACH:

My patch with that change - in my test - is working as well.
Neither of the solutions has triggered other regressions in my setup - but
there are so many potential use-cases that I can't be sure without a further
review by subject matter experts.
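
To make the three variants easier to compare, here is a minimal user-space
sketch (not kernel code) of just the decision in the IO_SCH_ORPH_UNREG /
IO_SCH_UNREG case. The enum, the variant names and would_unregister() are
made up for illustration; the kaihengfeng variant is modeled as "if (cdev)",
which is my reading of the patch as described above.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel's action codes. Only the two
 * actions discussed here are modeled, plus a catch-all for the rest. */
enum io_sch_action { IO_SCH_UNREG, IO_SCH_ORPH_UNREG, IO_SCH_OTHER };

/* Hypothetical labels for the three variants under test. */
enum variant {
	ORIGINAL_5_11,      /* if (!cdev) css_sch_device_unregister(sch); */
	KAIHENGFENG_PATCH,  /* assumed: if (cdev) css_sch_device_unregister(sch); */
	NO_CDEV_CHECK       /* css_sch_device_unregister(sch); unconditionally */
};

static bool is_unreg_action(enum io_sch_action action)
{
	return action == IO_SCH_UNREG || action == IO_SCH_ORPH_UNREG;
}

/* Returns true if the given variant would call
 * css_sch_device_unregister() for this action and cdev state. */
bool would_unregister(enum variant v, enum io_sch_action action,
		      bool have_cdev)
{
	if (!is_unreg_action(action))
		return false;

	switch (v) {
	case ORIGINAL_5_11:
		return !have_cdev;
	case KAIHENGFENG_PATCH:
		return have_cdev;
	case NO_CDEV_CHECK:
		return true;
	}
	return false;
}
```

If the hot-unplug path reaches this case with a cdev still attached (an
assumption that fits the test results, not something verified here), the
original check skips the unregister while both patched variants perform it.
The two patches only differ for cdev==NULL, where the kaihengfeng variant
skips the unregister and the no-check variant still does it.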

So a summary of the recent tests:

5.11.0-16-generic #17+lp1925211v202104201520 (Seth's full revert) - working
5.11.0lp1925211-patch-kaihengfeng-dirty - working
5.11.0nocdevcheck-paelzer-dirty - working

I think we'd want an answer from the IBM devs on which solution (full
revert, kaihengfeng patch, cpaelzer patch, another approach) they would
prefer - then we can submit it upstream for them to include officially,
and we can carry it as a delta until we rebase onto a version that has it
applied anyway.

[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8cc0dcfdc1c0e0be107d0288f9c0cf1f4201be62

https://bugs.launchpad.net/bugs/1925211

Title:
  Hot-unplug of disks leaves broken block devices around in Hirsute on
  s390x

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in systemd package in Ubuntu:
  New
Status in udev package in Ubuntu:
  New
Status in linux source package in Hirsute:
  Confirmed
Status in systemd source package in Hirsute:
  New
Status in udev source package in Hirsute:
  New

Bug description:
  Repro:
  #1 Get a guest
  $ uvt-kvm create --disk 5 --password=ubuntu h release=hirsute arch=s390x label=daily
  $ uvt-kvm wait h release=hirsute arch=s390x label=daily

  #2 Attach and Detach disk
  $ sudo qemu-img create -f qcow2 /var/lib/libvirt/images/test.qcow2 10M
  $ virsh attach-disk h /var/lib/libvirt/images/test.qcow2 vdc
  $ virsh detach-disk h vdc

  
  From libvirt's POV it is gone at this point
  $ virsh domblklist h
   Target   Source
  ------------------------------------------------------------------
   vda      /var/lib/uvtool/libvirt/images/hirsute-2nd-zfs.qcow
   vdb      /var/lib/uvtool/libvirt/images/hirsute-2nd-zfs-ds.qcow

  But the guest still thinks it is present
  $ uvt-kvm ssh --insecure hirsute-2nd-zfs lsblk
    ...
    vdc    252:32   0   20M  0 disk

  This even remains a while after (not a race).

  Any access to it in the guest will hang (as you'd expect of a non-existing blockdev)
  4     0    1758    1739  20   0  12140  4800 -      S+   pts/0      0:00  |           \_ sudo mkfs.ext4 /dev/vdc
  4     0    1759    1758  20   0   6924  1044 -      D+   pts/0      0:00  |               \_ mkfs.ext4 /dev/vdc

  The result above was originally found with hirsute-guest@hirsute-host
  on s390x

  I do NOT see the same with groovy-guest@hirsute-host on s390x
  I DO see the same with hirsute-guest@groovy-host on s390x
    => Guest version dependent, not Host/Hypervisor dependent
  I DO see the same with ZFS disks AND LVM disks being added & removed
    => not type dependent
  I do NOT see the same on x86.
    => Arch dependent ??

  ... the evidence slowly points towards an issue in the guest, damn we are so
  close to release - but non-fully detaching disks are critical in my POV :-/

  Filing this as-is for awareness, but certainly this will need more debugging.
  Unsure where this will eventually end up, so for now I'll file it against kernel/udev/systemd.
  If there are any known issues/components that are related let me know please!
  --- 
  ProblemType: Bug
  AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu65
  Architecture: s390x
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  CRDA: N/A
  CasperMD5CheckResult: unknown
  DistroRelease: Ubuntu 21.04
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lspci:
   
  Lspci-vt: -[0000:00]-
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  Lsusb-t: Error: command ['lsusb', '-t'] failed with exit code 1: /sys/bus/usb/devices: No such file or directory
  Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
  Package: udev
  PackageArchitecture: s390x
  PciMultimedia:
   
  ProcFB:
   
  ProcKernelCmdLine: root=LABEL=cloudimg-rootfs
  ProcVersionSignature: User Name 5.11.0-14.15-generic 5.11.12
  RelatedPackageVersions:
   linux-restricted-modules-5.11.0-14-generic N/A
   linux-backports-modules-5.11.0-14-generic  N/A
   linux-firmware                             N/A
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  hirsute uec-images
  Uname: Linux 5.11.0-14-generic s390x
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video
  _MarkForUpload: True
  acpidump:
