On Tue, Mar 24, 2026 at 06:42:31PM +0100, Matias Ezequiel Vara Larsen wrote:
> On Mon, Mar 23, 2026 at 11:36:18AM +0100, Marc Kleine-Budde wrote:
> > Hello,
> > 
> 
> Hello and thanks, I am struggling to come up with a solution for this
> comment.
> 
> > The nipa/ai-review bot found these problems:
> > 
> > | 
> > https://netdev-ai.bots.linux.dev/ai-review.html?id=b046d75a-2de1-4063-bb96-8f64abc65e7d
> > 
> > commit 0f3cb3b3bc94eaff9b9709b528e498fe6150f9c1
> > Author: Matias Ezequiel Vara Larsen <[email protected]>
> > 
> > can: virtio: Add virtio CAN driver
> > 
> > This commit adds a virtio CAN driver implementing a complete CAN bus
> > interface over Virtio transport, supporting both CAN Classic and CAN-FD
> > frames.
> > 
> > > diff --git a/drivers/net/can/virtio_can.c b/drivers/net/can/virtio_can.c
> > > new file mode 100644
> > > index 000000000000..57641c3ed6d0
> > > --- /dev/null
> > > +++ b/drivers/net/can/virtio_can.c
> > 
> > [ ... ]
> > 
> > > +static int __maybe_unused virtio_can_restore(struct virtio_device *vdev)
> > > +{
> > > + struct virtio_can_priv *priv = vdev->priv;
> > > + struct net_device *ndev = priv->dev;
> > > + size_t size;
> > > + int err;
> > > +
> > > + napi_enable(&priv->napi);
> > > + napi_enable(&priv->napi_tx);
> > > +
> > > + err = virtio_can_find_vqs(priv);
> > > + if (err != 0)
> > > +         return err;
> > 
> > What happens to the enabled NAPI instances when this error path is taken?
> > virtio_can_restore() enables both NAPI instances at the start, but if
> > virtio_can_find_vqs() fails here, the function returns without calling
> > napi_disable().
> > 
> I was expecting that, if virtio_can_restore() fails, a subsequent
> virtio_can_remove() would not hang in napi_disable(), since the NAPIs
> were already enabled.
> 

Ping @Marc Kleine-Budde: do you have any thoughts on my comments
above?

Thanks, Matias

> > The corresponding virtio_can_freeze() function disables both NAPIs before
> > returning, so the restore function leaves the NAPI subsystem in an
> > inconsistent state on failure.
> > 
> > If the PM framework retries restore after this failure, the second
> > napi_enable() call on an already-enabled NAPI will hit the BUG_ON in
> > napi_enable_locked():
> >
> 
> Does the PM framework really retry if virtio_can_restore() fails? If not,
> that BUG_ON is never reached.
> 
> > net/core/dev.c:napi_enable_locked() {
> >     BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
> >     ...
> > }
> > 
> > In contrast, virtio_can_probe() enables NAPI only after all setup
> > succeeds, avoiding this issue.
> > 
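
The probe-style ordering described above can be modelled in userspace.
This is a minimal sketch, assuming a one-flag stand-in for struct
napi_struct; napi_model, dev_model and fail_setup are hypothetical names,
not driver code, and assert() stands in for the BUG_ON:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct napi_struct: a single state flag. */
struct napi_model {
	bool enabled;
};

/* Enabling an already-enabled instance is treated as a bug (BUG_ON idea). */
static void napi_model_enable(struct napi_model *n)
{
	assert(!n->enabled);
	n->enabled = true;
}

/* Hypothetical device model; fail_setup injects a failure in the setup steps. */
struct dev_model {
	struct napi_model napi;
	struct napi_model napi_tx;
	bool fail_setup;
};

/* Restore variant that enables NAPI only after every fallible step passed. */
int restore_late_enable(struct dev_model *d)
{
	if (d->fail_setup)
		return -EIO;	/* placeholder error; NAPI state is untouched */

	napi_model_enable(&d->napi);
	napi_model_enable(&d->napi_tx);
	return 0;
}
```

With this ordering a failed restore leaves both instances disabled, so a
retry would not trip the double-enable check. Whether enabling late is
safe against early virtqueue callbacks in the real driver is not something
this model covers.
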
> > > +
> > > + size = priv->rpkt_len * priv->vqs[VIRTIO_CAN_QUEUE_RX]->num_free;
> > > + priv->rpkt = devm_krealloc(&vdev->dev, priv->rpkt, size, GFP_KERNEL | __GFP_ZERO);
> > > + if (!priv->rpkt) {
> > > +         virtio_can_del_vq(vdev);
> > > +         return -ENOMEM;
> > > + }
> > 
> > The same NAPI state issue occurs on this error path.
> > 
> > > + virtio_can_populate_rx_vq(vdev);
> > > +
> > > + if (netif_running(ndev)) {
> > > +         err = virtio_can_start(ndev);
> > > +         if (err) {
> > > +                 virtio_can_del_vq(vdev);
> > > +                 return err;
> > > +         }
> > 
> > And again on this error path.
> > 
> > Should all three error paths call napi_disable(&priv->napi_tx) and
> > napi_disable(&priv->napi) before returning the error code?
> > 
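
The unwinding asked about above could look roughly like the following
userspace model. It is only a sketch: napi_model, dev_model and the fail_*
flags are hypothetical stand-ins, and the error codes used for the
find_vqs and start steps are placeholders.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct napi_struct: a single state flag. */
struct napi_model {
	bool enabled;
};

/* Enabling an already-enabled instance is treated as a bug (BUG_ON idea). */
static void napi_model_enable(struct napi_model *n)
{
	assert(!n->enabled);
	n->enabled = true;
}

static void napi_model_disable(struct napi_model *n)
{
	n->enabled = false;
}

/* Hypothetical device model; each fail_* flag injects an error in one step. */
struct dev_model {
	struct napi_model napi;
	struct napi_model napi_tx;
	bool vqs_ready;
	bool fail_find_vqs;
	bool fail_alloc;
	bool fail_start;
};

/* Restore with goto-based unwinding: every error path disables both NAPIs. */
int restore_model(struct dev_model *d)
{
	int err;

	napi_model_enable(&d->napi);
	napi_model_enable(&d->napi_tx);

	if (d->fail_find_vqs) {		/* models virtio_can_find_vqs() failing */
		err = -EIO;
		goto err_napi;
	}
	d->vqs_ready = true;

	if (d->fail_alloc) {		/* models devm_krealloc() returning NULL */
		err = -ENOMEM;
		goto err_vqs;
	}

	if (d->fail_start) {		/* models virtio_can_start() failing */
		err = -EIO;
		goto err_vqs;
	}

	return 0;

err_vqs:
	d->vqs_ready = false;		/* models virtio_can_del_vq() */
err_napi:
	napi_model_disable(&d->napi_tx);
	napi_model_disable(&d->napi);
	return err;
}
```

One caveat, following the earlier point in this thread: if restore leaves
the NAPIs disabled on failure, a later remove path would have to avoid
calling napi_disable() on them a second time.
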
> > -- 
> > Pengutronix e.K.                 | Marc Kleine-Budde          |
> > Embedded Linux                   | https://www.pengutronix.de |
> > Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
> > Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |
> 
> 

