On 04.04.2013, at 14:45, Gleb Natapov wrote:

> On Thu, Apr 04, 2013 at 02:39:51PM +0200, Alexander Graf wrote:
>>
>> On 04.04.2013, at 14:38, Gleb Natapov wrote:
>>
>>> On Thu, Apr 04, 2013 at 02:32:08PM +0200, Alexander Graf wrote:
>>>>
>>>> On 04.04.2013, at 14:08, Gleb Natapov wrote:
>>>>
>>>>> On Thu, Apr 04, 2013 at 01:57:34PM +0200, Alexander Graf wrote:
>>>>>>
>>>>>> On 04.04.2013, at 12:50, Michael S. Tsirkin wrote:
>>>>>>
>>>>>>> With KVM, MMIO is much slower than PIO, due to the need to
>>>>>>> do page walk and emulation. But with EPT, it does not have to be: we
>>>>>>> know the address from the VMCS, so if the address is unique, we can look
>>>>>>> up the eventfd directly, bypassing emulation.
>>>>>>>
>>>>>>> Add an interface for userspace to specify this per-address; we can
>>>>>>> use this e.g. for virtio.
>>>>>>>
>>>>>>> The implementation adds a separate bus internally. This serves two
>>>>>>> purposes:
>>>>>>> - minimize overhead for old userspace that does not use PV MMIO
>>>>>>> - minimize disruption in other code (since we don't know the length,
>>>>>>>   devices on the MMIO bus only get a valid address in write; this
>>>>>>>   way we don't need to touch all devices to teach them to handle
>>>>>>>   an invalid length)
>>>>>>>
>>>>>>> At the moment, this optimization is only supported for EPT on x86 and
>>>>>>> silently ignored for NPT and MMU, so everything works correctly but
>>>>>>> slowly.
>>>>>>>
>>>>>>> TODO: NPT, MMU and non-x86 architectures.
>>>>>>>
>>>>>>> The idea was suggested by Peter Anvin. Lots of thanks to Gleb for
>>>>>>> pre-review and suggestions.
>>>>>>>
>>>>>>> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>>>>>>
>>>>>> This still uses page fault intercepts, which are orders of magnitude
>>>>>> slower than hypercalls. Why don't you just create a PV MMIO hypercall
>>>>>> that the guest can use to invoke MMIO accesses towards the host based on
>>>>>> physical addresses with explicit length encodings?
>>>>>>
>>>>> It is slower, but not an order of magnitude slower. It becomes faster
>>>>> with newer HW.
>>>>>
>>>>>> That way you simplify and speed up all code paths, exceeding the speed
>>>>>> of PIO exits even. It should also be quite easily portable, as all other
>>>>>> platforms have hypercalls available as well.
>>>>>>
>>>>> We are trying to avoid PV as much as possible (well, this is also PV,
>>>>> but not guest visible).
>>>>
>>>> Also, how is this not guest visible? Who sets KVM_IOEVENTFD_FLAG_PV_MMIO?
>>>> The comment above its definition indicates that the guest does so, so it
>>>> is guest visible.
>>>>
>>> QEMU sets it.
>>
>> How does QEMU know?
>>
> Knows what? When to create such an eventfd? The virtio device knows.

Where does it know that from?

>
>>>
>>>> +/*
>>>> + * PV_MMIO - Guest can promise us that all accesses touching this address
>>>> + * are writes of specified length, starting at the specified address.
>>>> + * If not - it's a Guest bug.
>>>> + * Can not be used together with either PIO or DATAMATCH.
>>>> + */
>>>>
>>> Virtio spec will state that access to a kick register needs to be of
>>> specific length. This is a reasonable thing for HW to ask.
>>
>> This is a spec change. So the guest would have to indicate that it adheres
>> to a newer spec. Thus it's a guest visible change.
>>
> There is no virtio spec that has a kick register in MMIO. The spec is in
> the works AFAIK. Actually PIO will not be deprecated and my suggestion

So the guest would indicate that it supports a newer revision of the spec (in your case, that it supports MMIO). How is that any different from exposing that it supports a PV MMIO hcall?

> is to move to MMIO only when PIO address space is exhausted. For PCI it
> will be never, for PCI-e it will be after ~16 devices.

Ok, let's go back a step here. Are you actually able to measure any difference in performance with this patch applied versus without, when going through MMIO kicks?


Alex
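
For reference, a minimal sketch of what the userspace (QEMU) side of the proposed interface could look like, assuming the KVM_IOEVENTFD_FLAG_PV_MMIO flag and bit position from the patch under discussion (not part of the mainline KVM ABI); register_pv_mmio_kick is an illustrative helper name, not existing code:

#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Proposed by the patch under discussion; not in mainline linux/kvm.h,
 * so define it here for the sketch. Bit value is illustrative. */
#ifndef KVM_IOEVENTFD_FLAG_PV_MMIO
#define KVM_IOEVENTFD_FLAG_PV_MMIO (1 << 3)
#endif

/* Register an eventfd that fires on guest writes to the virtio kick
 * register at guest physical address 'addr'. With the PV_MMIO flag, KVM
 * could match the address taken from the EPT violation and signal the
 * eventfd directly, without decoding the faulting instruction. */
static int register_pv_mmio_kick(int vm_fd, uint64_t addr, uint32_t len)
{
    int efd = eventfd(0, 0);
    if (efd < 0)
        return -1;

    struct kvm_ioeventfd ioev = {
        .addr  = addr,   /* guest physical address of the kick register */
        .len   = len,    /* access length the guest promises to use */
        .fd    = efd,
        .flags = KVM_IOEVENTFD_FLAG_PV_MMIO, /* no DATAMATCH, no PIO */
    };

    if (ioctl(vm_fd, KVM_IOEVENTFD, &ioev) < 0) {
        close(efd);
        return -1;
    }

    return efd; /* poll this fd (e.g. from vhost or a QEMU iothread) for kicks */
}

Per the comment quoted above, the contract would be that the guest only ever issues writes of exactly the registered length at exactly that address; anything else is a guest bug, which is what makes the fast lookup possible without knowing the access size at fault time.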