On 12/13/2017 10:31 AM, David Hildenbrand wrote: > On 13.12.2017 10:16, Christian Borntraeger wrote: >> >> >> On 12/12/2017 04:28 PM, Cornelia Huck wrote: >>> On Tue, 12 Dec 2017 16:17:17 +0100 >>> David Hildenbrand <da...@redhat.com> wrote: >>> >>>> On 12.12.2017 15:29, Cornelia Huck wrote: >>>>> On Tue, 12 Dec 2017 15:13:46 +0100 >>>>> Christian Borntraeger <borntrae...@de.ibm.com> wrote: >>>>> >>>>>> On 12/12/2017 02:49 PM, Cornelia Huck wrote: >>>>> >>>>>>> One thing I noticed: You removed the caching of the flic (in the old >>>>>>> kvm inject routine), and you generally do more qom invocations (first, >>>>>>> to find the common flic; then, to translate to the qemu or kvm flic). >>>>>>> Not sure if this might be a problem (probably not). >>>>>> >>>>>> Is any of these calls on a potential fast path (e.g. guest without >>>>>> adapter >>>>>> interrupts)? If yes, then QOM is a no-go since it is really slow. >>>>> >>>>> At least the new airq interface was using QOM without caching before. >>>>> >>>>> It's basically about any interrupt; but otoh we are (for kvm) in >>>>> userspace already. Caching the flic and just keeping the casting to the >>>>> specialized flic might be ok (I'd guess that the lookup is the slowest >>>>> path.) >>>>> >>>> >>>> Please note that the lookup is already cached in s390_get_flic(); That >>>> should be sufficient, as it does the expensive lookup. One cache should >>>> be enough, no? >>> >>> Ah, missed that. So the old code actually did double caching... >>> >>>> >>>> The other conversions should be cheap (and already were in place in a >>>> couple of places before). >>> >>> Yes, object_resolve_path() is probably the most expensive one. >>> >>> Did anyone ever check if the (existing) conversions are actually >>> measurable? >> >> Did some quick measurement. >> the initial object resolve takes about 20000ns, the cached ones then is >> 2-5ns. >> (measuring s390_get_flic function). >> >> >> The other conversions (S390FLICStateClass *fsc = >> S390_FLIC_COMMON_GET_CLASS(fs);) >> takes about 15-25ns (up to 100 cycles). >> For irqfd users this does not matter, but things like plan9fs might benefit >> if we speedup this lookup as well? > > Did you also measure the state conversion like QEMU_S390_FLIC() or > KVM_S390_FLIC() ?
somewhat between 4 and 39 ns.