On 3/6/25 09:56, Warner Losh wrote:
On Thu, Mar 6, 2025, 5:33 AM John Baldwin <j...@freebsd.org> wrote:
On 3/6/25 06:35, Mateusz Guzik wrote:
On Thu, Mar 6, 2025 at 12:32 PM Zhenlei Huang <z...@freebsd.org> wrote:
On Mar 6, 2025, at 7:03 PM, Mateusz Guzik <m...@freebsd.org> wrote:
The branch main has been updated by mjg:
URL:
https://cgit.FreeBSD.org/src/commit/?id=234683726708cf5212d672d676d30056d4133859
commit 234683726708cf5212d672d676d30056d4133859
Author: Mateusz Guzik <m...@freebsd.org>
AuthorDate: 2025-03-06 11:01:49 +0000
Commit: Mateusz Guzik <m...@freebsd.org>
CommitDate: 2025-03-06 11:01:49 +0000
devclass: make devclass_alloc_unit use M_NOWAIT
The only caller already does this.
The routine can be called with a mutex held making M_WAITOK illegal.
Sponsored by: Rubicon Communications, LLC ("Netgate")
---
sys/kern/subr_bus.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/sys/kern/subr_bus.c b/sys/kern/subr_bus.c
index 9506e471705c..0422352bba51 100644
--- a/sys/kern/subr_bus.c
+++ b/sys/kern/subr_bus.c
@@ -1208,6 +1208,7 @@ devclass_get_sysctl_tree(devclass_t dc)
static int
devclass_alloc_unit(devclass_t dc, device_t dev, int *unitp)
{
+ device_t *devices;
const char *s;
int unit = *unitp;
@@ -1264,8 +1265,11 @@ devclass_alloc_unit(devclass_t dc, device_t dev,
int *unitp)
int newsize;
newsize = unit + 1;
- dc->devices = reallocf(dc->devices,
- newsize * sizeof(*dc->devices), M_BUS, M_WAITOK);
+ devices = reallocf(dc->devices,
+ newsize * sizeof(*dc->devices), M_BUS, M_NOWAIT);
I'd recommend against this. From the commit message of f3d3c63442ff,
Warner said,
In addition, transition to M_WAITOK since this is a sleepable context
So, the M_WAITOK is intentional.
Rather than reverting this, the caller devclass_add_device() should use
M_WAITOK.
Per my commit message this is callable from a *NOT* sleepable context.
Here is a splat we got at Netgate:
uma_zalloc_debug: zone "malloc-16" with the following non-sleepable
locks held:
exclusive sleep mutex SD slot mtx (sdhci) r = 0 (0xd8dec028) locked @
/var/jenkins/workspace/pfSense-Plus-snapshots-25_03-main/sources/FreeBSD-src-plus-RELENG_25_03/sys/dev/sdhci/sdhci.c:688
stack backtrace:
#0 0xc0330ebc at witness_debugger+0x78
#1 0xc033217c at witness_warn+0x428
#2 0xc05b0a58 at uma_zalloc_debug+0x34
#3 0xc05b067c at uma_zalloc_arg+0x30
#4 0xc0291760 at malloc+0x8c
#5 0xc02920ec at reallocf+0x14
#6 0xc02f8894 at devclass_add_device+0x1e8
#7 0xc02f6c78 at make_device+0xe0
#8 0xc02f6abc at device_add_child_ordered+0x30
#9 0xc0156e0c at sdhci_card_task+0x238
#10 0xc0324090 at taskqueue_run_locked+0x1b4
#11 0xc0323ea0 at taskqueue_run+0x50
#12 0xc0275f88 at ithread_loop+0x264
Just use a regular taskqueue like taskqueue_thread instead of
taskqueue_swi?
PCI hotplug defines its own thread taskqueue for adding and removing
devices.
The bug is here, IMO. Eventually new-bus will need some sort of topology
lock and that will have to be an sx lock, so this code needs to change
anyway. The sound code that tries to frob devices with a regular mutex
also needs to change.
I should dust off the branch that i have this one. There's about a dozen
places I had to change at the time...
In terms of taskqueue_swi, it's probably something that needs to go away.
Generally speaking, code uses tasks for functions that need to sleep,
and taskqueue_swi breaks that.
Its a holdover from spl days for sure.
I will fix sdhci to use a proper taskqueue and then revert this commit.
Thanks. You can add me to the review
It turned into a series. I'll post for review in a bit.
--
John Baldwin