Package: linux-image-5.10.0-9-amd64
Version: linux-image-5.10.0-9-amd64 and linux-image-5.14.0-0.bpo.2-amd64
Severity: important

Dear All,

I am reporting this bug mostly to help others with the same problem, to propose
adding a warning to the Debian 11 release notes, and in the hope of an upstream
kernel bugfix.

Description: A Debian 11 Xen PV DomU (RAM < 4 GB) does not shut down correctly
because of a problem in the intel_pmc_core module on an Intel Xeon E3-1230 (and
possibly other Intel CPUs).

https://github.com/QubesOS/qubes-issues/issues/6052 seems to be the same issue.

Workarounds:
* Use a Debian 10 kernel in the DomU, which works
* Allocate 4+ GB RAM to the DomU
* Use PVH instead of PV (needs Xen 4.9+, and is the preferred way since Xen
4.10)
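
To check existing DomU configs against the conditions above, a rough sketch
like the following may help. It is only a heuristic derived from this report,
not a definitive test. The helper names parse_xl_config and is_at_risk are
hypothetical and not part of Xen. The sketch assumes that "memory" is given in
MiB and that a config without a "type" line describes a PV guest.

```python
import ast
import re

# Matches simple single-line `key = value` assignments in an xl config.
ASSIGN = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.+?)\s*$")


def parse_xl_config(text):
    """Hypothetical helper: collect single-line settings from an xl config."""
    settings = {}
    for line in text.splitlines():
        # Naive comment stripping: would also cut a '#' inside a string.
        line = line.split("#", 1)[0]
        match = ASSIGN.match(line)
        if not match:
            continue
        key, raw = match.groups()
        try:
            settings[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            settings[key] = raw  # keep values we cannot parse as plain text
    return settings


def is_at_risk(settings):
    """Heuristic from this report: PV guest with less than 4 GB of RAM."""
    # Assumption: no `type` line means the guest is PV.
    guest_type = str(settings.get("type", "pv")).lower()
    if guest_type != "pv":
        return False
    memory_mib = settings.get("memory")
    if not isinstance(memory_mib, int):
        return True  # memory unknown: flag conservatively
    return memory_mib < 4096
```

For example, is_at_risk(parse_xl_config(open(path).read())) could be run over
each config file in turn, where path is whatever location holds the guest
configs.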

Please note:
* Backports kernel (linux-image-5.14.0-0.bpo.2-amd64) suffers from the same
problem.
* Debian 10 Dom0 Xen 4.11.4+107-gef32c7afa2-1 behaves the same way
* Debian 9 Dom0 Xen 4.8.5.final+shim4.10.4-1+deb9u12 used PVHv1, which differs
from PVHv2 used by Xen 4.9+

Test case:
* Install Debian 10 or Debian 11, install Xen, and create a PV config as below.
Upon startup, "BUG: unable to handle page fault for address" is displayed, and
the DomU later fails to stop with "poweroff".
kernel = "/usr/lib/grub-xen/grub-x86_64-xen.bin"
extra = '(hd1)/boot/grub/grub.cfg'
* Change PV to PVH and it works correctly:
kernel = "/root/xen/images/debian11/vmlinuz-5.10.0-9-amd64"
ramdisk = "/root/xen/images/debian11/initrd.img-5.10.0-9-amd64"
type = 'pvh'

The full bug in my case:
[    3.088164] BUG: unable to handle page fault for address: ffffc9004049b818
[    3.088175] #PF: supervisor read access in kernel mode
[    3.088179] #PF: error_code(0x0000) - not-present page
[    3.088183] PGD 7fbd9067 P4D 7fbd9067 PUD 5186067 PMD 5303067 PTE 0
[    3.088191] Oops: 0000 [#1] SMP NOPTI
[    3.088195] CPU: 0 PID: 201 Comm: systemd-udevd Not tainted 5.10.0-9-amd64 #1 Debian 5.10.70-1
[    3.088204] RIP: e030:pmc_core_probe+0x136/0x410 [intel_pmc_core]
[    3.088209] Code: c0 48 c7 c7 48 a6 3c c0 e8 c7 25 d2 c0 48 8b 05 b0 7a 00 00 48 c7 83 88 00 00 00 20 a6 3c c0 48 63 40 50 48 03 05 92 7a 00 00 <8b> 00 48 8b 15 91 7a 00 00 48 c7 c7 e0 54 3c c0 8b 4a 54 ba 01 00
[    3.088222] RSP: e02b:ffffc9004026fc30 EFLAGS: 00010286
[    3.088226] RAX: ffffc9004049b818 RBX: ffff88800b028400 RCX: 00000000fe002000
[    3.088232] RDX: ffffffffc03ca600 RSI: ffffffffc03c41f6 RDI: ffffffffc03ca648
[    3.088238] RBP: ffff88800b028410 R08: 0000000000000000 R09: 00000000fe001fff
[    3.088244] R10: 0000000000007ff0 R11: ffff888008e01740 R12: 0000000000000000
[    3.088249] R13: 0000000000000000 R14: 0000000000000006 R15: 0000000000000000
[    3.088260] FS:  00007f94ad4928c0(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
[    3.088267] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.088271] CR2: ffffc9004049b818 CR3: 00000000075d4000 CR4: 0000000000050660
[    3.088281] Call Trace:
[    3.088288]  platform_drv_probe+0x35/0x80
[    3.088294]  really_probe+0x37b/0x480
[    3.088299]  driver_probe_device+0xe1/0x150
[    3.088303]  ? driver_allows_async_probing+0x50/0x50
[    3.088308]  bus_for_each_drv+0x7e/0xc0
[    3.088313]  __device_attach+0xd8/0x1d0
[    3.088317]  bus_probe_device+0x8e/0xa0
[    3.088321]  device_add+0x399/0x840
[    3.088325]  platform_device_add+0x105/0x230
[    3.088331]  ? 0xffffffffc0327000
[    3.088351]  pmc_core_platform_init+0x78/0x1000 [intel_pmc_core_pltdrv]
[    3.088358]  do_one_initcall+0x44/0x1d0
[    3.088363]  ? do_init_module+0x23/0x260
[    3.088381]  ? kmem_cache_alloc_trace+0xf5/0x200
[    3.088386]  do_init_module+0x5c/0x260
[    3.088391]  __do_sys_finit_module+0xb1/0x110
[    3.088397]  do_syscall_64+0x33/0x80
[    3.088402]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    3.088407] RIP: 0033:0x7f94ad94b9b9
[    3.088411] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a7 54 0c 00 f7 d8 64 89 01 48
[    3.088423] RSP: 002b:00007ffde7aa3158 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    3.088430] RAX: ffffffffffffffda RBX: 0000563dc68d1530 RCX: 00007f94ad94b9b9
[    3.088435] RDX: 0000000000000000 RSI: 00007f94adad6e2d RDI: 0000000000000018
[    3.088441] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000563dc689c9d0
[    3.088447] R10: 0000000000000018 R11: 0000000000000246 R12: 00007f94adad6e2d
[    3.088453] R13: 0000000000000000 R14: 0000563dc68ce450 R15: 0000563dc68d1530
[    3.088459] Modules linked in: intel_pmc_core_pltdrv(+) intel_pmc_core ghash_clmulni_intel evdev aesni_intel libaes crypto_simd cryptd glue_helper pcspkr drm fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic crct10dif_pclmul crct10dif_common crc32_pclmul xen_netfront xen_blkfront crc32c_intel
[    3.088487] CR2: ffffc9004049b818
[    3.088491] ---[ end trace dd3aec620db68a1d ]---
[    3.088496] RIP: e030:pmc_core_probe+0x136/0x410 [intel_pmc_core]
[    3.088501] Code: c0 48 c7 c7 48 a6 3c c0 e8 c7 25 d2 c0 48 8b 05 b0 7a 00 00 48 c7 83 88 00 00 00 20 a6 3c c0 48 63 40 50 48 03 05 92 7a 00 00 <8b> 00 48 8b 15 91 7a 00 00 48 c7 c7 e0 54 3c c0 8b 4a 54 ba 01 00
[    3.088514] RSP: e02b:ffffc9004026fc30 EFLAGS: 00010286
[    3.088519] RAX: ffffc9004049b818 RBX: ffff88800b028400 RCX: 00000000fe002000
[    3.088524] RDX: ffffffffc03ca600 RSI: ffffffffc03c41f6 RDI: ffffffffc03ca648
[    3.088530] RBP: ffff88800b028410 R08: 0000000000000000 R09: 00000000fe001fff
[    3.088536] R10: 0000000000007ff0 R11: ffff888008e01740 R12: 0000000000000000
[    3.088541] R13: 0000000000000000 R14: 0000000000000006 R15: 0000000000000000
[    3.088552] FS:  00007f94ad4928c0(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
[    3.088558] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.088563] CR2: ffffc9004049b818 CR3: 00000000075d4000 CR4: 0000000000050660
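
For others who want to check whether a DomU hit the same fault, here is a
minimal sketch that scans kernel log text (for example the saved output of
dmesg) for the signature seen in the trace above: a page-fault BUG line
followed by a RIP line inside the intel_pmc_core module. The function name
find_pmc_core_oops and the two regular expressions are my own illustration,
written against the log format shown in this report, so other kernel versions
may print these lines differently.

```python
import re

# Patterns written against the log lines in this report (an assumption,
# not a stable kernel interface).
BUG_RE = re.compile(
    r"BUG: unable to handle page fault for address: ([0-9a-f]+)"
)
RIP_RE = re.compile(
    r"RIP: [0-9a-f]+:(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]"
)


def find_pmc_core_oops(log_text):
    """Return (fault_address, function) for the first page fault whose
    faulting RIP is in intel_pmc_core, or None if there is none."""
    fault = None
    for line in log_text.splitlines():
        bug = BUG_RE.search(line)
        if bug:
            fault = bug.group(1)  # remember the address, wait for the RIP line
            continue
        rip = RIP_RE.search(line)
        if rip and fault is not None:
            function, module = rip.groups()
            if module == "intel_pmc_core":
                return fault, function
            fault = None  # page fault in some other module: keep scanning
    return None
```

Given the trace above as input, the two lines that matter are the "BUG:" line
and the first "RIP: e030:pmc_core_probe..." line. The later user-space RIP
line carries no module name and is therefore skipped.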

Please ignore "Other system information" as I have to report this from a
different machine due to network separation.



-- System Information:
Debian Release: 10.11
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.19.0-18-amd64 (SMP w/8 CPU cores)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
