On Sun, Feb 26, 2023 at 01:28:04PM +0100, Mark Kettenis wrote:
> > Date: Sun, 26 Feb 2023 18:13:18 +1000
> > From: David Gwynne <da...@gwynne.id.au>

yeesh, i should have proofread my email before i sent it. sorry about
making it harder to read than it should have been.

> > i picked a couple of Dell Wyse 3040 boxes, which are very cute, i
> > like them a lot. however, i have to disable acpitz to be able to
> > use them because the driver gets stuck during attach.
> > 
> > during apcitz_attach does a read of all the temperatures. the read
> > of _TMP ends up talking to tipmic(4) via tipmic_thermal_opreg_handler().
> > tipmic_thermal_opreg_handler has a loop on line 335 waiting for
> > sc->sc_stat_adc to change, but that value is only set from tipmic_intr.
> > acpitz_attach is running while the kernel is code, and it appears that
> > the interrupt handler never runs, so that value never changes, and
> > acpitz blocks. also because it's cold, the timeout on the tsleep doesn't
> > do anything. thanks to patrick for helping me on the acpi side of things
> > so we could figure this out.
> 
> A better approach might be to make sure that while we're cold,
> tipmic_thermal_opreg_handler() polls for completion.  Something like:
> 
>       while (sc->sc_stat_adc == 0) {
>               if (cold) {
>                       delay(1000);
>                       tpmic_intr();
>               } else {
>                       if (tsleep(&sc->sc_stat_adc, PRIBIO, "tipmic",
>                           SEC_TO_NSEC(1))) {
>                               ...
>                       }
>               }
>       }       
> 
> 
> > i tried deferring basically all of acpitz_attach to when kthreads are
> > running, and that works well enough to get to userland.
> > 
> > is that reasonable?
> 
> The problem is that you can't really know whether AML accesses the
> opregion while cold.

good point. the diff below works in this situation and is less
intrusive.

> > also, shortly after dwiic complains about short reads and the kernel
> > locks up again. i'll have to plug it in and transcribe the exact
> > errors. i think that's a separate problem though.
> 
> Yes, dwiic(4) has always been somewhat problematic.  Transactions seem
> to fail randomly on some platforms like the atom system you're looking
> at but also on my Ampere eMAG system.

fun. i managed to catch some of the dwiic stuff via dmesg before
it locked up:

dwiic0: timed out waiting for tx_empty intr
dwiic0: timed out waiting for rx_full intr
dwiic0: timed out reading remaining 1
tipmic0: can't read register 0x5b
dwiic0: timed out waiting for tx_empty intr
dwiic0: timed out reading remaining 1
tipmic0: can't read register 0x01
dwiic0: timed out reading remaining 1
tipmic0: can't read register 0x01
dwiic0: timed out waiting for rx_full intr
dwiic0: timed out reading remaining 1
tipmic0: can't read register 0x5a
dwiic0: timed out waiting for rx_full intr
dwiic0: timed out reading remaining 1
tipmic0: can't read register 0x50
dwiic0: timed out waiting for stop intr
dwiic0: timed out waiting for stop intr
dwiic0: timed out waiting for stop intr
dwiic0: timed out reading remaining 1
tipmic0: can't read register 0x01
dwiic0: timed out waiting for bus idle
dwiic0: timed out waiting for stop intr
dwiic0: timed out waiting for stop intr
dwiic0: timed out waiting for stop intr
dwiic0: timed out waiting for stop intr
dwiic0: timed out waiting for stop intr
dwiic0: timed out reading remaining 1
tipmic0: can't read register 0x01
dwiic0: timed out waiting for bus idle

Index: tipmic.c
===================================================================
RCS file: /cvs/src/sys/dev/acpi/tipmic.c,v
retrieving revision 1.7
diff -u -p -r1.7 tipmic.c
--- tipmic.c    6 Apr 2022 18:59:27 -0000       1.7
+++ tipmic.c    26 Feb 2023 23:56:04 -0000
@@ -276,6 +276,25 @@ struct tipmic_regmap tipmic_thermal_regm
        { 0x18, TIPMIC_SYSTEMP_HI, TIPMIC_SYSTEMP_LO }
 };
 
+static int
+tipmic_wait_adc(struct tipmic_softc *sc)
+{
+       int i;
+
+       if (!cold) {
+               return (tsleep_nsec(&sc->sc_stat_adc, PRIBIO, "tipmic",
+                   SEC_TO_NSEC(1)));
+       }
+
+       for (i = 0; i < 1000; i++) {
+               delay(1000);
+               if (tipmic_intr(sc) == 1)
+                       return (0);
+       }
+
+       return (EWOULDBLOCK);
+}
+
 int
 tipmic_thermal_opreg_handler(void *cookie, int iodir, uint64_t address,
     int size, uint64_t *value)
@@ -333,8 +352,7 @@ tipmic_thermal_opreg_handler(void *cooki
        splx(s);
 
        while (sc->sc_stat_adc == 0) {
-               if (tsleep_nsec(&sc->sc_stat_adc, PRIBIO, "tipmic",
-                   SEC_TO_NSEC(1))) {
+               if (tipmic_wait_adc(sc)) {
                        printf("%s: ADC timeout\n", sc->sc_dev.dv_xname);
                        break;
                }

Reply via email to