On Mon, Mar 23, 2026 at 11:36:18AM +0100, Marc Kleine-Budde wrote:
> Hello,
>
Hello and thanks, I am struggling to come up with a solution for this
comment.

> The nipa/ai-review bot found these problems:
>
> | > https://netdev-ai.bots.linux.dev/ai-review.html?id=b046d75a-2de1-4063-bb96-8f64abc65e7d
>
> commit 0f3cb3b3bc94eaff9b9709b528e498fe6150f9c1
> Author: Matias Ezequiel Vara Larsen <[email protected]>
>
>     can: virtio: Add virtio CAN driver
>
>     This commit adds a virtio CAN driver implementing a complete CAN bus
>     interface over Virtio transport, supporting both CAN Classic and
>     CAN-FD frames.
>
> > diff --git a/drivers/net/can/virtio_can.c b/drivers/net/can/virtio_can.c
> > new file mode 100644
> > index 000000000000..57641c3ed6d0
> > --- /dev/null
> > +++ b/drivers/net/can/virtio_can.c
>
> [ ... ]
>
> > +static int __maybe_unused virtio_can_restore(struct virtio_device *vdev)
> > +{
> > +	struct virtio_can_priv *priv = vdev->priv;
> > +	struct net_device *ndev = priv->dev;
> > +	size_t size;
> > +	int err;
> > +
> > +	napi_enable(&priv->napi);
> > +	napi_enable(&priv->napi_tx);
> > +
> > +	err = virtio_can_find_vqs(priv);
> > +	if (err != 0)
> > +		return err;
>
> What happens to the enabled NAPI instances when this error path is taken?
> virtio_can_restore() enables both NAPI instances at the start, but if
> virtio_can_find_vqs() fails here, the function returns without calling
> napi_disable().

I was expecting that, if virtio_can_restore() fails, a subsequent
virtio_can_remove() will not hang in napi_disable(), since the NAPIs were
already enabled.

> The corresponding virtio_can_freeze() function disables both NAPIs before
> returning, so the restore function leaves the NAPI subsystem in an
> inconsistent state on failure.
>
> If the PM framework retries restore after this failure, the second
> napi_enable() call on an already-enabled NAPI will hit the BUG_ON in
> napi_enable_locked():

Does the PM framework really retry if virtio_can_restore() fails? If not,
that BUG_ON is never reached.
> net/core/dev.c:napi_enable_locked()
> {
>	BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
>	...
> }
>
> In contrast, virtio_can_probe() enables NAPI only after all setup
> succeeds, avoiding this issue.
>
> > +
> > +	size = priv->rpkt_len * priv->vqs[VIRTIO_CAN_QUEUE_RX]->num_free;
> > +	priv->rpkt = devm_krealloc(&vdev->dev, priv->rpkt, size,
> > +				   GFP_KERNEL | __GFP_ZERO);
> > +	if (!priv->rpkt) {
> > +		virtio_can_del_vq(vdev);
> > +		return -ENOMEM;
> > +	}
>
> The same NAPI state issue occurs on this error path.
>
> > +	virtio_can_populate_rx_vq(vdev);
> > +
> > +	if (netif_running(ndev)) {
> > +		err = virtio_can_start(ndev);
> > +		if (err) {
> > +			virtio_can_del_vq(vdev);
> > +			return err;
> > +		}
>
> And again on this error path.
>
> Should all three error paths call napi_disable(&priv->napi_tx) and
> napi_disable(&priv->napi) before returning the error code?
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde          |
> Embedded Linux                   | https://www.pengutronix.de |
> Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |
