This response was AI-generated by bug-bot. The analysis may contain errors -
please verify independently.
Hi Alice,
Thanks for the detailed report. Here is my analysis.
___
1. Bug Summary
The platform-device-devm kunit test suite crashes with a general
protection fault in queued_spin_lock_slowpath() during device
registration, followed by cascading failures including sysfs duplicate
filename errors. The root issue is test isolation: earlier kunit tests
(including the intentional NULL dereference in kunit_test_null_dereference)
corrupt kernel state, and the platform-device-devm tests use raw
platform device APIs without kunit-managed cleanup, so they cannot
recover from or survive this corrupted state. In terms of severity,
this is a test infrastructure issue, not a driver core bug.
2. Stack Trace Analysis
First crash (Oops #3; two earlier oopses had already occurred):
Oops: general protection fault, probably for non-canonical address
0xb4c3c33fcc9f57f6: 0000 [#3] SMP PTI
CPU: 0 UID: 0 PID: 2500 Comm: kunit_try_catch Tainted: G D W N
7.0.0-rc1-00138-g0c21570fbd3d-dirty #3 PREEMPT(lazy)
Tainted: [D]=DIE, [W]=WARN, [N]=TEST
RIP: 0010:queued_spin_lock_slowpath+0x120/0x1c0
RAX: b4c3c340405a5a26 RBX: ffffb222800e3ce8 RCX: 0000000000050000
RDX: ffffa0a4fec1ddd0 RSI: 0000000000000010 RDI: ffffa0a4c2b43340
Call Trace:
<TASK>
klist_iter_exit+0x2c/0x70
? __pfx___device_attach_driver+0x10/0x10
bus_for_each_drv+0x12a/0x160
__device_attach+0xbf/0x160
device_initial_probe+0x2f/0x50
bus_probe_device+0x8f/0x110
device_add+0x23f/0x3d0
platform_device_add+0x137/0x1d0
platform_device_devm_register_unregister_test+0x6c/0x2e0
kunit_try_run_case+0x8f/0x190
kunit_generic_run_threadfn_adapter+0x1d/0x40
kthread+0x142/0x160
ret_from_fork+0xc7/0x1f0
ret_from_fork_asm+0x1a/0x30
</TASK>
The crash point is in queued_spin_lock_slowpath() at
kernel/locking/qspinlock.c, called from klist_iter_exit() at
lib/klist.c:311. RAX holds the non-canonical address
0xb4c3c340405a5a26, which points to corrupted klist data. The call
chain runs in process context:
platform_device_devm_register_unregister_test() calls
platform_device_add() -> device_add() -> bus_probe_device() ->
__device_attach() -> bus_for_each_drv() (drivers/base/bus.c:420),
which iterates the bus's klist_drivers. When klist_iter_exit() tries
to acquire the klist spinlock, it hits the corrupted memory.
Second failure (duplicate sysfs entry):
sysfs: cannot create duplicate filename '/devices/platform/test'
Call Trace:
<TASK>
dump_stack_lvl+0x2d/0x70
sysfs_create_dir_ns+0xe8/0x130
kobject_add_internal+0x1dd/0x360
kobject_add+0x88/0xf0
device_add+0x171/0x3d0
platform_device_add+0x137/0x1d0
platform_device_devm_register_get_unregister_with_devm_test+0x6c/0x2f0
kunit_try_run_case+0x8f/0x190
kunit_generic_run_threadfn_adapter+0x1d/0x40
kthread+0x142/0x160
ret_from_fork+0xc7/0x1f0
ret_from_fork_asm+0x1a/0x30
</TASK>
The assertion at drivers/base/test/platform-device-test.c:97 fails
with ret == -17 (EEXIST) because the first test crashed without
unregistering its device, leaving "/devices/platform/test" in sysfs.
3. Root Cause Analysis
This is a test isolation problem, not a driver core bug. Two issues
combine to cause the failures:
(a) Corrupted kernel state from earlier oopses. The Oops header shows
"[#3]" meaning this is the third kernel oops during the boot. The
taint flags [D]=DIE and [W]=WARN confirm prior fatal faults. The
kunit_test_null_dereference() function in lib/kunit/kunit-test.c:117
intentionally dereferences NULL to test kunit's fault handling. After
multiple oopses, kernel data structures (including the platform bus
klist) can be corrupted, which explains the non-canonical address
(0xb4c3c33fcc9f57f6) seen during spinlock acquisition.
(b) Missing test-managed cleanup. The four tests in
platform_device_devm_test_suite all use the raw kernel APIs
platform_device_alloc() and platform_device_add() directly, and all
use the same hardcoded name "test" with PLATFORM_DEVID_NONE
(drivers/base/test/platform-device-test.c:62-77). If a test crashes
before reaching platform_device_unregister(), the device remains
registered and subsequent tests cannot register a device with the
same name.
By contrast, the platform_device_find_by_null_test() in the same
file already uses the kunit-managed helpers kunit_platform_device_alloc()
and kunit_platform_device_add() from lib/kunit/platform.c (added in
commit 5ac79730324c "platform: Add test managed platform_device/driver
APIs"), which automatically unregister the device when the test exits,
even on crash.
4. Affected Versions
The platform-device-devm tests were introduced in commit b4cc44301b9d
("drivers: base: Add basic devm tests for platform devices") by
Maxime Ripard, which predates the kunit-managed platform device helpers
from commit 5ac79730324c. All kernel versions containing these tests
are affected by this test isolation issue. This is not a regression in
the driver core itself.
5. Relevant Commits and Fixes
- b4cc44301b9d ("drivers: base: Add basic devm tests for platform devices")
Introduced the test suite with raw platform device APIs.
- 699fb50d99039 ("drivers: base: Free devm resources when unregistering a
device")
Fixed devm resource release for unprobed devices; updated test expectations.
- 5ac79730324c ("platform: Add test managed platform_device/driver APIs")
Added kunit_platform_device_alloc()/kunit_platform_device_add()
helpers that automatically clean up on test exit.
- 86a5f32ed8813 ("drivers: base: test: Add ...find_device_by...(... NULL)
tests")
Added platform_device_find_by_null_test which already uses the
kunit-managed helpers correctly.
No fix for this specific test isolation issue exists in mainline yet.
6. Suggested Actions
The devm tests should be updated to use the kunit-managed platform
device helpers (kunit_platform_device_alloc/kunit_platform_device_add)
from include/kunit/platform_device.h, similar to how
platform_device_find_by_null_test() already does. This would ensure
proper cleanup even when tests crash.
One subtlety: the devm tests specifically test that
platform_device_unregister() releases devm resources, so
kunit_platform_device_add() (which auto-unregisters) needs to be
used carefully. The kunit cleanup action should be removed or
disabled before the explicit unregister call so that the device is
not unregistered twice. Alternatively, each test could use a unique
device name via PLATFORM_DEVID_AUTO to at least prevent the
cascading sysfs duplicate errors.
For the probed tests (probed_platform_device_devm_register_unregister_test
and probed_platform_device_devm_register_get_unregister_with_devm_test),
there is a similar need to register/unregister the fake_driver with
kunit-managed helpers like kunit_platform_driver_register().
In the short term, you can work around this by running the
platform-device-devm suite in isolation:
./tools/testing/kunit/kunit.py run --make_options LLVM=1 \
--arch x86_64 --kconfig_add CONFIG_RUST=y \
--kconfig_add CONFIG_PCI=y platform-device-devm
This avoids the corrupted state from earlier intentional-crash tests.