On Wed, Jun 26, 2013 at 5:55 PM, Paolo Bonzini <pbonz...@redhat.com> wrote: > Il 26/06/2013 11:44, liu ping fan ha scritto: >> On Wed, Jun 26, 2013 at 4:38 PM, Paolo Bonzini <pbonz...@redhat.com> wrote: >>> Il 26/06/2013 10:20, liu ping fan ha scritto: >>>>>> On the other hand, pushing _delete() out of finalization path is not >>>>>> easy, since we do not what time the DeviceState has done with its bh. >>>>> >>>>> See above: >>>>> >>>>> - if the BH will run in the iothread, the BH is definitely not running >>>>> (because the BH runs under the BQL, and the reclaimer owns it) >>>> >>>> It works for the case in which the DeviceState's reclaimer calls >>>> _bh_delete(). But in another case(also exist in current code), where >>>> we call _bh_delete() in callback, the bh will be scheduled, and oops! >>> >>> But you may know that the BH is not scheduled after removal, too. >>> >> No, the removal can run in parallel with the other mmio-dispatch which >> can resort to schedule a bh. > > But then behavior is more or less undefined. Of course the device must > ensure that something sane happens, but that's the responsibility of the > device, not of the generic layer. > >> I.e, the removal can not call >> _bh_delete(), so it do not know whether bh will be scheduled or not. > > It can call qemu_bh_delete(), then all concurrent qemu_bh_schedule() > will be no-ops. > Great, with this ability, we can achieve it in a more elegant way -- rcu! BTW, the responsibility is also assigned to guest driver wrt its unplug behavior.
>>> If you look at the memory work, for example, the owner patches happen to >>> be useful for BQL-less dispatch too, but they are solving existing >>> hot-unplug bugs. >>> >> Oh, I will read it again, since I had thought the owner is only used >> for the purpose of refcnt. > > It is. But it goes beyond BQL-less dispatch. Anything that gives away > the BQL between a map and unmap (including asynchronous I/O) needs an > owner. We have ignored that until now because we do not have memory unplug. > >>>>> What we need is separation of removal and reclamation. Without that, >>>>> any effort to make things unplug-safe is going to be way way more >>>>> complicated than necessary. >>>>> >>>> Agree, but when I tried for bh, I found the separation of removal and >>>> reclamation are not easy. We can not _bh_delete() in >>>> acpi_piix_eject_slot(), because mmio-dispatch can schedule a bh at the >>>> same time. >>> >>> acpi_piix_eject_slot() is removal, not reclamation. The plan there is >>> that qdev_free calls the exit callback immediately (which can do >>> qemu_bh_delete), and schedules an unref after the next RCU grace period. >>> At the next RCU grace period the BH will not be running, thus it is >>> safe to finalize the object. >>> >> I have two question: >> 1st, who trigger qdev_free? //Not figuring out the total design, but >> I feel it is quite different from kernel's design, where refcnt->0, >> then exit is called. > > qdev_free is triggered by the guest, but free is a misnomer. It is > really "make it inaccessible from the guest and management" (the kernel > equivalent would be removal of /dev and /sys entries, for example). The > actual "free" will happen later. > Without seeing your detail design, but I suggest that leaving the "exit" as it is, and pick out the inaccessible related code to removal. Finally, when refcnt->0, exit is called, and it play as the final sync point for the remaining access. Regards, Pingfan >> 2nd, _delete_bh() means even there is any not-severed request, the >> request will be canceled. Is it legal, and will not lose data(not >> sure, since I do not know who will call qdev_free)? > > That's something that should be ensured by the device. For example it > is not a problem to cancel virtio-net's tx_bh. If it is a problem, the > device must do something else. It could even be a bdrv_drain_all() in > the worst case. > > Paolo