On Mon, Sep 20, 2010 at 03:50:51PM -0500, Anthony Liguori wrote: > On 09/20/2010 03:37 PM, Michael S. Tsirkin wrote: > >On Mon, Sep 20, 2010 at 03:38:36PM -0500, Anthony Liguori wrote: > >>On 09/20/2010 03:27 PM, Michael S. Tsirkin wrote: > >>>On Mon, Sep 20, 2010 at 03:20:59PM -0500, Anthony Liguori wrote: > >>>>On 09/20/2010 02:44 PM, Michael S. Tsirkin wrote: > >>>>>>I think the only workable approach that doesn't involve new commands > >>>>>>is to change the semantics of the existing ones. > >>>>>> > >>>>>>Make netdev_del work regardless of whether the device is still present. > >>>>>> > >>>>>>You would need to reference count the actual netdev structure and > >>>>>>have each device using it unref on delete. You make netdev_del mark > >>>>>>the device as deleted and when a device is deleted, any calls into > >>>>>>the device effectively become nops. > >>>>>> > >>>>>>You have to go through most of the cleanup process to ensure that > >>>>>>tap device gets closed even before your reference count goes to > >>>>>>zero. > >>>>>I think you mean 'does not get closed': we need the fd to get the flags > >>>>>etc. > >>>>No, I actually meant does get closed. > >>>> > >>>>When you do netdev_del, it should result in the fd getting closed. > >>>> > >>>>The actual netdev structure then becomes a zombie that's completely > >>>>useless until the device goes away. > >>>> > >>>>>Note that it will mostly work unless when it'll crash. > >>>>>Issue is we don't have any documentation so > >>>>>people get the command set by trial and error. > >>>>> > >>>>>So how can we prove it's a user bug and not qemu bug? > >>>>>I guess we should blame ourselves until proven innocent. > >>>>Here's what I'm now suggesting: > >>>> > >>>>device_del -> may or may not unplug a device from a guest when it > >>>>returns. To figure out if it does, you have to run info qdm. > >>>I think it should also always unplug on guest reset. > >>> > >>>>netdev_del -> always destroys a netdev device when it returns. May > >>>>be called at any point in time. If you destroy a netdev while the > >>>>device is still using it, all packets go into the bit bucket and the > >>>>link status is modified to be unplugged. > >>>One issue here is that we can't allow a new device with same name > >>>to be created until the nic is destroyed. > >>A new netdev device? Why not? > >Because it won't work: it will try to pair with existing nic device > >(it is looked up by name) and that will fail. > > No, netdev_del should remove the VLANClientState from the > non_vlan_clients list. > > It's no longer enumerable and it's no longer lookup-able. > > The only reason it stays around it so that the device doesn't have a > reference to a free pointer. The only field that's ever looked at > is is_deleted which is used by every function to turn around and > implement a nop. > > The VLANClientState is a hollow shell of it's former glorious self. > The remainder of it's (hopefully short) life is merely so that we > can avoid touching every device to teach them about disconnecting > backends.
We'll have to tell them link is down, won't we? > >>Because the fundamental problem is that device_del is too difficult > >>to use. You're just making netdev_del equally difficult to use. > >> > >>Try your patch with libvirt, don't load acpiphp in the guest, and > >>then play around with virsh device_detach and device_attach. All > >>sorts of badness will ensue as libvirt tries to manage assignment of > >>PCI slot numbers and such. > >> > >>Regards, > >> > >>Anthony Liguori > >At some level, that's right. Is yours better? > > device_del is still busted but at least netdev_del behaves the way > libvirt expects it to. > > >I guess the right thing is to wait for libvirt guys to tell > >us what they prefer. > > Yeah, I think some clarity is needed. > > Regards, > > Anthony Liguori >