Module Name:    src
Committed By:   martin
Date:           Sat Feb  3 12:41:30 UTC 2024

Modified Files:
        src/sys/dev/sdmmc [netbsd-9]: ld_sdmmc.c

Log Message:
Pull up following revision(s) (requested by riastradh in ticket #1793):

        sys/dev/sdmmc/ld_sdmmc.c: revision 1.43

ld@sdmmc(4): Hack around deadlock in cache sync on detach.

Yanking a card triggers the sdmmc discovery task, which runs in the
sdmmc task thread, to detach any attached child devices.

Detaching ld@sdmmc triggers a cache flush (via ldbegindetach ->
disk_begindetach -> ld_lastclose -> ld_flush -> ioctl DIOCCACHESYNC),
which is implemented by scheduling a task to do sdmmc_mem_flush_cache
and then waiting for it to complete.

The sdmmc_mem_flush_cache call is done by an sdmmc task so it happens
after all previously scheduled I/O operations -- that way the cache
flush doesn't complete until the previously scheduled I/O operations
are complete.

However, when the cache flush task is issued from the discovery task,
this doesn't work, because the cache flush task can't start until the
discovery task has returned -- but the discovery task won't return
until the cache flush task has completed.

To work around this deadlock, which usually happens only when the
device has already been yanked so further I/O would be lost anyway,
just do the cache flush synchronously in DIOCCACHESYNC if we're
running in the task thread.

This isn't quite right -- implementation details of the task thread
shouldn't bleed into ld@sdmmc, and running the cache sync _before_
any subsequently scheduled I/O tasks is asking for trouble -- but it
should serve to avoid the deadlock in PR kern/57870 until we can fix
a host of concurrency bugs in sdmmc by fixing the locking scheme and
running discovery in a separate thread from tasks.
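
As a rough illustration of the workaround described above, here is a
minimal user-space sketch in C. It is not the kernel code. It models the
sdmmc task thread as a single FIFO queue drained by one loop. All names
(struct taskq, taskq_add, taskq_run, cache_sync, discovery_task) are
hypothetical, and the in_worker flag is only a stand-in for the
curlwp == sdmmc->sc_tskq_lwp check in the actual change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 8

typedef void (*task_fn)(void *);

struct task {
	task_fn fn;
	void *arg;
};

/* Hypothetical single-worker task queue; not the sdmmc data structure. */
struct taskq {
	struct task q[QUEUE_MAX];
	size_t head, tail;
	bool in_worker;		/* stand-in for curlwp == sc_tskq_lwp */
	int flushes;		/* total cache flushes performed */
	int inline_flushes;	/* flushes run directly on the worker */
};

/* Append a task; returns false if the (non-wrapping) queue is full. */
static bool
taskq_add(struct taskq *tq, task_fn fn, void *arg)
{
	if (tq->tail == QUEUE_MAX)
		return false;
	tq->q[tq->tail++] = (struct task){ fn, arg };
	return true;
}

/* Drain the queue in FIFO order, as the one worker thread would. */
static void
taskq_run(struct taskq *tq)
{
	assert(!tq->in_worker);	/* the worker loop is not re-entrant */
	tq->in_worker = true;
	while (tq->head < tq->tail) {
		struct task t = tq->q[tq->head++];
		t.fn(t.arg);
	}
	tq->in_worker = false;
}

static void
flush_task(void *arg)
{
	struct taskq *tq = arg;
	tq->flushes++;
}

/*
 * Cache sync.  From outside the worker: queue a flush task and wait for
 * it (modelled here by draining the queue).  From inside the worker:
 * waiting could never finish, because the queued task cannot start
 * until the current task returns, so flush synchronously instead.
 */
static int
cache_sync(struct taskq *tq)
{
	if (tq->in_worker) {
		flush_task(tq);
		tq->inline_flushes++;
		return 0;
	}
	if (!taskq_add(tq, flush_task, tq))
		return -1;
	taskq_run(tq);
	return 0;
}

/* Models the discovery task detaching a child, which triggers a sync. */
static void
discovery_task(void *arg)
{
	(void)cache_sync(arg);
}
```

Calling cache_sync() from ordinary context goes through the queue, so it
is ordered after earlier tasks. Calling it from discovery_task() while the
queue is being drained takes the inline path, which avoids the wait but
also jumps ahead of anything still queued. That is the trade-off the log
message flags as not quite right.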


To generate a diff of this commit:
cvs rdiff -u -r1.36.4.1 -r1.36.4.2 src/sys/dev/sdmmc/ld_sdmmc.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/sys/dev/sdmmc/ld_sdmmc.c
diff -u src/sys/dev/sdmmc/ld_sdmmc.c:1.36.4.1 src/sys/dev/sdmmc/ld_sdmmc.c:1.36.4.2
--- src/sys/dev/sdmmc/ld_sdmmc.c:1.36.4.1	Sun Aug  9 14:03:07 2020
+++ src/sys/dev/sdmmc/ld_sdmmc.c	Sat Feb  3 12:41:29 2024
@@ -1,4 +1,4 @@
-/*	$NetBSD: ld_sdmmc.c,v 1.36.4.1 2020/08/09 14:03:07 martin Exp $	*/
+/*	$NetBSD: ld_sdmmc.c,v 1.36.4.2 2024/02/03 12:41:29 martin Exp $	*/
 
 /*
  * Copyright (c) 2008 KIYOHARA Takashi
@@ -28,7 +28,7 @@
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: ld_sdmmc.c,v 1.36.4.1 2020/08/09 14:03:07 martin Exp $");
+__KERNEL_RCSID(0, "$NetBSD: ld_sdmmc.c,v 1.36.4.2 2024/02/03 12:41:29 martin Exp $");
 
 #ifdef _KERNEL_OPT
 #include "opt_sdmmc.h"
@@ -589,9 +589,24 @@ static int
 ld_sdmmc_cachesync(struct ld_softc *ld, bool poll)
 {
 	struct ld_sdmmc_softc *sc = device_private(ld->sc_dv);
+	struct sdmmc_softc *sdmmc = device_private(device_parent(ld->sc_dv));
 	struct ld_sdmmc_task *task;
 	int error = -1;
 
+	/*
+	 * If we come here through the sdmmc discovery task, we can't
+	 * wait for a new task because the new task can't even begin
+	 * until the sdmmc discovery task has completed.
+	 *
+	 * XXX This is wrong, because there may already be queued I/O
+	 * tasks ahead of us.  Fixing this properly requires doing
+	 * discovery in a separate thread.  But this should avoid the
+	 * deadlock of PR kern/57870 (https://gnats.NetBSD.org/57870)
+	 * until we do split that up.
+	 */
+	if (curlwp == sdmmc->sc_tskq_lwp)
+		return sdmmc_mem_flush_cache(sc->sc_sf, poll);
+
 	mutex_enter(&sc->sc_lock);
 
 	/* Acquire a free task, or fail with EBUSY.  */
