On Thu, Mar 24, 2022 at 07:41:44AM +0100, Anton Lindqvist wrote: > >Synopsis: bse: null dereference in genet_rxintr() > >Category: arm64 > >Environment: > System : OpenBSD 7.1 > Details : OpenBSD 7.1-beta (GENERIC.MP) #1594: Mon Mar 21 06:55:12 > MDT 2022 > > [email protected]:/usr/src/sys/arch/arm64/compile/GENERIC.MP > > Architecture: OpenBSD.arm64 > Machine : arm64 > >Description: > > Booting my rpi4 often but not always causes a panic while rc(8) tries to start > the bse network interface: > > panic: attempt to access user address 0x38 from EL1 > Stopped at panic+0x160: cmp w21, #0x0 > TID PID UID PRFLAGS PFLAGS CPU COMMAND > * 0 0 0 0x10000 0x200 0K swapper > db_enter() at panic+0x15c > panic() at do_el1h_sync+0x1f8 > do_el1h_sync() at handle_el1h_sync+0x6c > handle_el1h_sync() at genet_rxintr+0x120 > genet_rxintr() at genet_intr+0x74 > genet_intr() at ampintc_irq_handler+0x14c > ampintc_irq_handler() at arm_cpu_irq+0x30 > arm_cpu_irq() at handle_el1h_irq+0x6c > handle_el1h_irq() at ampintc_splx+0x80 > ampintc_splx() at genet_ioctl+0x158 > genet_ioctl() at ifioctl+0x308 > ifioctl() at nfs_boot_init+0xc0 > nfs_boot_init() at nfs_mountroot+0x3c > nfs_mountroot() at main+0x464 > main() at virtdone+0x70 > > >Fix: > > The mbuf associated with the current index is NULL. I noticed that the NetBSD > driver allocates mbufs for each ring entry in genet_setup_dma(). But even with > that in place the same panic still occurs. Enabling GENET_DEBUG shows that the > total is quite high: > > RX pidx=0000ca07 total=51463 > > > Since it's greater than GENET_DMA_DESC_COUNT (=256) the null dereference will > still happen after doing more than 256 iterations in genet_rxintr() since we > will start accessing mbufs cleared by the previous iteration. > > Here's a diff with what I've tried so far. The KASSERT() is just capturing the > problem at an earlier stage. Any pointers would be much appreciated.
Further digging reveals that writes to GENET_RX_DMA_PROD_INDEX are ignored by the hardware. That's why I ended up with a large amount of mbufs available in genet_rxintr() since the software and hardware state was out of sync. Honoring any existing value makes the problem go away and matches what u-boot[1] does as well. The current RX cidx/pidx defaults in genet_fill_rx_ring() where probably carefully selected as they ensure that the rx ring is filled with at least the configured low watermark number of mbufs. However, instead of being forced to ensure a pidx - cidx delta above 0 on the first invocations of genet_fill_rx_ring(), RX_DESC_COUNT could simply be passed as the max argument to if_rxr_get() which will clamp the value anyway. Also, I've seen up to 8 mbufs being available per rx interrupt which is odd as only a less amount of rx ring entries are actually populated. Not sure if the driver is missing some interrupt threshold configuration. Increasing the rx ring low watermark to 8 "solved" it for now. Otherwise, the same null dereference occurs while trying to access empty mbuf ring entries. Worth mentioning is that the NetBSD driver does not suffer from the same problem as they keep all rx ring entries populated all the time. Looking for feedback and OKs at this point. [1] https://github.com/u-boot/u-boot/blob/a94ab561e2f49a80d8579930e840b810ab1a1330/drivers/net/bcmgenet.c#L404 diff --git sys/dev/ic/bcmgenet.c sys/dev/ic/bcmgenet.c index 923df40bdfc..6070f9d202e 100644 --- sys/dev/ic/bcmgenet.c +++ sys/dev/ic/bcmgenet.c @@ -311,16 +311,13 @@ void genet_fill_rx_ring(struct genet_softc *sc, int qid) { struct mbuf *m; - uint32_t cidx, index, total; + uint32_t cidx, index; u_int slots; int error; cidx = sc->sc_rx.cidx; - total = (sc->sc_rx.pidx - cidx) & 0xffff; - KASSERT(total <= RX_DESC_COUNT); - index = sc->sc_rx.cidx & (RX_DESC_COUNT - 1); - for (slots = if_rxr_get(&sc->sc_rx_ring, total); + for (slots = if_rxr_get(&sc->sc_rx_ring, RX_DESC_COUNT); slots > 0; slots--) { if ((m = genet_alloc_mbufcl(sc)) == NULL) { printf("%s: cannot allocate RX mbuf\n", @@ -517,9 +514,9 @@ genet_init_rings(struct genet_softc *sc, int qid) /* RX ring */ - sc->sc_rx.next = 0; - sc->sc_rx.cidx = 0; - sc->sc_rx.pidx = RX_DESC_COUNT; + sc->sc_rx.cidx = RD4(sc, GENET_RX_DMA_CONS_INDEX(qid)) & 0xffff; + sc->sc_rx.pidx = RD4(sc, GENET_RX_DMA_PROD_INDEX(qid)) & 0xffff; + sc->sc_rx.next = sc->sc_rx.cidx & (RX_DESC_COUNT - 1); WR4(sc, GENET_RX_SCB_BURST_SIZE, 0x08); @@ -543,7 +540,7 @@ genet_init_rings(struct genet_softc *sc, int qid) WR4(sc, GENET_RX_DMA_RING_CFG, __BIT(qid)); /* enable */ - if_rxr_init(&sc->sc_rx_ring, 2, RX_DESC_COUNT); + if_rxr_init(&sc->sc_rx_ring, 8, RX_DESC_COUNT); genet_fill_rx_ring(sc, qid); /* Enable receive DMA */
