This response was AI-generated by bug-bot. The analysis may contain errors - 
please verify independently.

Hi Alice,

Thanks for the detailed report. Here is my analysis.

___

1. Bug Summary

The platform-device-devm kunit test suite crashes with a general
protection fault in queued_spin_lock_slowpath() during device
registration, followed by cascading failures including sysfs duplicate
filename errors. The root issue is test isolation: earlier kunit tests
(including the intentional NULL dereference in kunit_test_null_dereference)
corrupt kernel state, and the platform-device-devm tests use raw
platform device APIs without kunit-managed cleanup, so they cannot
recover from or survive this corrupted state. The severity is a test
infrastructure issue, not a driver core bug.

2. Stack Trace Analysis

First crash (Oops #3; two earlier oopses had already occurred):

  Oops: general protection fault, probably for non-canonical address 0xb4c3c33fcc9f57f6: 0000 [#3] SMP PTI
  CPU: 0 UID: 0 PID: 2500 Comm: kunit_try_catch Tainted: G      D  W        N  7.0.0-rc1-00138-g0c21570fbd3d-dirty #3 PREEMPT(lazy)
  Tainted: [D]=DIE, [W]=WARN, [N]=TEST
  RIP: 0010:queued_spin_lock_slowpath+0x120/0x1c0
  RAX: b4c3c340405a5a26 RBX: ffffb222800e3ce8 RCX: 0000000000050000
  RDX: ffffa0a4fec1ddd0 RSI: 0000000000000010 RDI: ffffa0a4c2b43340
  Call Trace:
   <TASK>
   klist_iter_exit+0x2c/0x70
   ? __pfx___device_attach_driver+0x10/0x10
   bus_for_each_drv+0x12a/0x160
   __device_attach+0xbf/0x160
   device_initial_probe+0x2f/0x50
   bus_probe_device+0x8f/0x110
   device_add+0x23f/0x3d0
   platform_device_add+0x137/0x1d0
   platform_device_devm_register_unregister_test+0x6c/0x2e0
   kunit_try_run_case+0x8f/0x190
   kunit_generic_run_threadfn_adapter+0x1d/0x40
   kthread+0x142/0x160
   ret_from_fork+0xc7/0x1f0
   ret_from_fork_asm+0x1a/0x30
   </TASK>

The fault occurs in queued_spin_lock_slowpath() (kernel/locking/qspinlock.c),
called from klist_iter_exit() at lib/klist.c:311. RAX holds
0xb4c3c340405a5a26, which is not a canonical x86-64 address, so the
klist data being walked is most likely corrupted. The call chain runs
in process context (the kunit_try_catch thread):
platform_device_devm_register_unregister_test() calls
platform_device_add() -> device_add() -> bus_probe_device() ->
__device_attach() -> bus_for_each_drv() (drivers/base/bus.c:420),
which iterates the bus's klist_drivers. klist_iter_exit() then tries
to take the klist spinlock and faults on the corrupted memory.
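
To make the "non-canonical" observation concrete, here is a minimal
user-space sketch of the check. It assumes 4-level paging (48-bit
virtual addresses); the function name is hypothetical and this is not
kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: x86-64 with 4-level paging, i.e. 48-bit virtual addresses.
 * An address is canonical when bits 63..47 are all equal (all 0 or all 1).
 */
bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

Under that assumption, both the fault address 0xb4c3c33fcc9f57f6 and
the RAX value 0xb4c3c340405a5a26 fail the check, while RBX and RDI
from the same register dump pass it.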

Second failure (duplicate sysfs entry):

  sysfs: cannot create duplicate filename '/devices/platform/test'
  Call Trace:
   <TASK>
   dump_stack_lvl+0x2d/0x70
   sysfs_create_dir_ns+0xe8/0x130
   kobject_add_internal+0x1dd/0x360
   kobject_add+0x88/0xf0
   device_add+0x171/0x3d0
   platform_device_add+0x137/0x1d0
   platform_device_devm_register_get_unregister_with_devm_test+0x6c/0x2f0
   kunit_try_run_case+0x8f/0x190
   kunit_generic_run_threadfn_adapter+0x1d/0x40
   kthread+0x142/0x160
   ret_from_fork+0xc7/0x1f0
   ret_from_fork_asm+0x1a/0x30
   </TASK>

The assertion at drivers/base/test/platform-device-test.c:97 fails
with ret == -17 (EEXIST) because the first test crashed without
unregistering its device, leaving "/devices/platform/test" in sysfs.

3. Root Cause Analysis

This is a test isolation problem, not a driver core bug. Two issues
combine to cause the failures:

(a) Corrupted kernel state from earlier oopses. The Oops header shows
"[#3]", meaning this is the third kernel oops since boot. The [D]=DIE
taint flag confirms that the kernel had already oopsed, and [W]=WARN
that a warning had fired, before this fault. The
kunit_test_null_dereference() function in lib/kunit/kunit-test.c:117
intentionally dereferences NULL to test kunit's fault handling. After
multiple oopses, kernel data structures (including the platform bus
klist) can be corrupted, which would explain the non-canonical address
(0xb4c3c33fcc9f57f6) seen during spinlock acquisition.

(b) Missing test-managed cleanup. The four tests in
platform_device_devm_test_suite all use the raw kernel APIs
platform_device_alloc() and platform_device_add() directly, and all
use the same hardcoded name "test" with PLATFORM_DEVID_NONE
(drivers/base/test/platform-device-test.c:62-77). If a test crashes
before reaching platform_device_unregister(), the device remains
registered and subsequent tests cannot register a device with the
same name.
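
The cascade described in (b) can be reproduced with a small user-space
model. This is only a sketch: name_registry, registry_add(),
registry_remove() and run_model_test() are hypothetical stand-ins for
the sysfs namespace and the test bodies, not driver core APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MAX_DEVICES 8

/* Hypothetical stand-in for the names under /devices/platform. */
struct name_registry {
	const char *names[MAX_DEVICES];
};

/* Returns 0 on success, -EEXIST on a duplicate, -ENOSPC when full. */
int registry_add(struct name_registry *reg, const char *name)
{
	int free_slot = -1;

	assert(reg && name);
	for (int i = 0; i < MAX_DEVICES; i++) {
		if (reg->names[i] && strcmp(reg->names[i], name) == 0)
			return -EEXIST;
		if (!reg->names[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -ENOSPC;
	reg->names[free_slot] = name;
	return 0;
}

void registry_remove(struct name_registry *reg, const char *name)
{
	for (int i = 0; i < MAX_DEVICES; i++) {
		if (reg->names[i] && strcmp(reg->names[i], name) == 0)
			reg->names[i] = NULL;
	}
}

/* Model of one test: register "test", then either "crash" or clean up. */
int run_model_test(struct name_registry *reg, bool crash_before_unregister)
{
	int ret = registry_add(reg, "test");

	if (ret)
		return ret;
	if (crash_before_unregister)
		return 0;	/* aborted: the name stays registered */
	registry_remove(reg, "test");
	return 0;
}
```

Running the model once with crash_before_unregister set and then again
without it makes the second run fail with -EEXIST, which matches the
ret == -17 seen in the second failure.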

By contrast, the platform_device_find_by_null_test() in the same
file already uses the kunit-managed helpers kunit_platform_device_alloc()
and kunit_platform_device_add() from lib/kunit/platform.c (added in
commit 5ac79730324c "platform: Add test managed platform_device/driver
APIs"), which automatically unregister the device when the test exits,
even on crash.
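
The idea behind test-managed cleanup can be sketched in user space as
a list of deferred actions that the harness runs whether the test body
returns or aborts. This is a simplified model, not the KUnit
implementation: model_test, model_add_action(), model_abort() and
model_run() are hypothetical names, and setjmp()/longjmp() stands in
for kunit's try-catch thread.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*cleanup_fn)(void *ctx);

/* Hypothetical model of a test context; not the real struct kunit. */
struct model_test {
	jmp_buf abort_point;
	cleanup_fn fns[MAX_ACTIONS];
	void *ctxs[MAX_ACTIONS];
	size_t nr_actions;
};

/* Queue a cleanup action to run when the test ends, however it ends. */
int model_add_action(struct model_test *t, cleanup_fn fn, void *ctx)
{
	assert(t && fn);
	if (t->nr_actions == MAX_ACTIONS)
		return -1;
	t->fns[t->nr_actions] = fn;
	t->ctxs[t->nr_actions] = ctx;
	t->nr_actions++;
	return 0;
}

/* Stand-in for a faulting test: jump straight back to the harness. */
void model_abort(struct model_test *t)
{
	longjmp(t->abort_point, 1);
}

/* Run the body, then run queued actions newest first. Returns 1 on abort. */
int model_run(struct model_test *t, void (*body)(struct model_test *))
{
	int aborted = 0;

	t->nr_actions = 0;
	if (setjmp(t->abort_point) == 0)
		body(t);
	else
		aborted = 1;

	while (t->nr_actions > 0) {
		t->nr_actions--;
		t->fns[t->nr_actions](t->ctxs[t->nr_actions]);
	}
	return aborted;
}

/* Example state: a flag that the cleanup action clears. */
int device_registered;

void model_unregister(void *ctx)
{
	*(int *)ctx = 0;
}

/* A body that registers, queues its cleanup, then "crashes". */
void crashing_body(struct model_test *t)
{
	device_registered = 1;
	model_add_action(t, model_unregister, &device_registered);
	model_abort(t);	/* never reaches a manual unregister */
}
```

In this model, model_run(&t, crashing_body) reports the abort and
still leaves device_registered cleared, which is the property the raw
platform_device_add() path lacks.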

4. Affected Versions

The platform-device-devm tests were introduced in commit b4cc44301b9d
("drivers: base: Add basic devm tests for platform devices") by
Maxime Ripard, which predates the kunit-managed platform device helpers
from commit 5ac79730324c. All kernel versions containing these tests
are affected by this test isolation issue. This is not a regression in
the driver core itself.

5. Relevant Commits and Fixes

- b4cc44301b9d ("drivers: base: Add basic devm tests for platform devices")
  Introduced the test suite with raw platform device APIs.

- 699fb50d99039 ("drivers: base: Free devm resources when unregistering a device")
  Fixed devm resource release for unprobed devices; updated test expectations.

- 5ac79730324c ("platform: Add test managed platform_device/driver APIs")
  Added kunit_platform_device_alloc()/kunit_platform_device_add()
  helpers that automatically clean up on test exit.

- 86a5f32ed8813 ("drivers: base: test: Add ...find_device_by...(... NULL) tests")
  Added platform_device_find_by_null_test which already uses the
  kunit-managed helpers correctly.

No fix for this specific test isolation issue exists in mainline yet.

6. Suggested Actions

The devm tests should be updated to use the kunit-managed platform
device helpers (kunit_platform_device_alloc/kunit_platform_device_add)
from include/kunit/platform_device.h, similar to how
platform_device_find_by_null_test() already does. This would ensure
proper cleanup even when tests crash.

One subtlety: the devm tests specifically test the behavior of
platform_device_unregister() releasing devm resources, so
kunit_platform_device_add() (which auto-unregisters) needs to be
used carefully. The kunit cleanup action should be removed or
disabled before the explicit unregister call, so that the device is
not unregistered a second time at test exit (risking a double free).
Alternatively, each test could request a unique instance ID via
PLATFORM_DEVID_AUTO, so that device names no longer collide and the
cascading sysfs duplicate errors are at least prevented.
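
As a rough illustration of why PLATFORM_DEVID_AUTO avoids the name
collision, the sketch below models how the device name is derived
from the base name and the ID. It is a user-space approximation:
model_device_name() and the MODEL_* constants are hypothetical, and
the simple counter stands in for the kernel's ID allocator.

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_DEVID_NONE (-1)	/* mirrors PLATFORM_DEVID_NONE */
#define MODEL_DEVID_AUTO (-2)	/* mirrors PLATFORM_DEVID_AUTO */

/* Simplification: a plain counter instead of a real ID allocator. */
static int next_auto_id;

/* Format the name a device would get; returns the snprintf() result. */
int model_device_name(char *buf, size_t len, const char *name, int id)
{
	assert(buf && name);
	switch (id) {
	case MODEL_DEVID_NONE:
		return snprintf(buf, len, "%s", name);
	case MODEL_DEVID_AUTO:
		return snprintf(buf, len, "%s.%d.auto", name, next_auto_id++);
	default:
		return snprintf(buf, len, "%s.%d", name, id);
	}
}
```

With MODEL_DEVID_NONE every test ends up with the bare name "test",
whereas two MODEL_DEVID_AUTO calls yield distinct names, so a device
leaked by a crashed test would not block the next registration.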

For the probed tests (probed_platform_device_devm_register_unregister_test
and probed_platform_device_devm_register_get_unregister_with_devm_test),
there is a similar need to register/unregister the fake_driver with
kunit-managed helpers like kunit_platform_driver_register().

In the short term, you can work around this by running the
platform-device-devm suite in isolation:

  ./tools/testing/kunit/kunit.py run --make_options LLVM=1 \
    --arch x86_64 --kconfig_add CONFIG_RUST=y \
    --kconfig_add CONFIG_PCI=y platform-device-devm

This avoids the corrupted state from earlier intentional-crash tests.

