On 01/04/2019 10:19, Andrew Cooper wrote:
> On 01/04/2019 08:05, Dario Faggioli wrote:
>> On Mon, 2019-04-01 at 08:06 +0200, Juergen Gross wrote:
>>> On 30/03/2019 11:24, Juergen Gross wrote:
>>>> I think it is easier to do it myself, as I'm touching nearly all of
>>>> the call sites anyway.
>>> And another thought I had: with RETPOLINE indirect jumps are even more
>>> expensive. Would it be a good idea to remove the function pointers from
>>> struct scheduler and generate the inline wrappers at build time?
>>>
>> Yep, I was thinking about doing something like that already,
>> independently from this feature/series.
>>
>> At least something that special-cases the configured default scheduler,
>> and lets its hooks be called without indirect jumps (i.e., similarly to
>> what's being done in Linux, in quite a few places, these days).
>>
>>> The wrappers could then call the related specific scheduler function
>>> based on the scheduler ID using a chain of if ... else if ... statements.
>>>
>> I guess we'd have to see how the final code will look, but I like the
>> idea, and I think it's well worth a try.
>
> Jan has a series in progress which does do some manual devirtualisation
> across Xen.
>
> The scheduler is harder though - we've got the default scheduler which
> is overwhelmingly likely to be the target of the call, but not always
> guaranteed.
>
> Normally, the result is put together with PGO rather than manually,
> because the effects are quite subtle.
>
> The base case which might be good enough for Xen is:
>
>     if ( sched == default )
>         sched_foo();
>     else
>         sched->foo();
>
> which for the common case of the default cpupool only, or multiple
> groups with the same scheduler, will always take the direct path rather
> than the indirect path.
>
> Beyond that, the best length of the if/else chain can only reasonably be
> determined with profiling. It depends on the relative frequencies of
> each call, and blindly doing an if/else chain to the end of the
> scheduler list will probably give worse performance if you're using the
> final scheduler than using a retpoline would. Furthermore, on future
> fixed hardware, using indirect calls will become the quicker option again.
>
> I think it's useful to consider potential optimisations,
> but I'd advise against trying to merge everything into this series.
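
For illustration only, here is a minimal C sketch of the base case quoted
above. The struct layout, the pick_cpu hook and the sched_default /
sched_other instances are hypothetical stand-ins invented for this example;
they are not Xen's actual struct scheduler or its hooks.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a scheduler ops struct. */
struct scheduler {
    const char *name;
    int (*pick_cpu)(const struct scheduler *ops, int unit_id);
};

/* Hypothetical hook of the scheduler chosen as default at build time. */
static int default_pick_cpu(const struct scheduler *ops, int unit_id)
{
    (void)ops;
    return unit_id % 4;
}

/* Hypothetical hook of some other, non-default scheduler. */
static int other_pick_cpu(const struct scheduler *ops, int unit_id)
{
    (void)ops;
    (void)unit_id;
    return 0;
}

static const struct scheduler sched_default = { "default", default_pick_cpu };
static const struct scheduler sched_other   = { "other",   other_pick_cpu };

/*
 * Wrapper: a compare plus a direct call for the default scheduler,
 * falling back to the indirect call (a retpoline when built with
 * one) for anything else.
 */
static inline int sched_pick_cpu(const struct scheduler *ops, int unit_id)
{
    if ( ops == &sched_default )
        return default_pick_cpu(ops, unit_id);  /* direct call */

    return ops->pick_cpu(ops, unit_id);         /* indirect call */
}
```

Callers would use sched_pick_cpu(ops, id) instead of ops->pick_cpu(ops, id).
Whether the extra compare pays off is a question for profiling, as noted in
the quoted text.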
Fine with me.


Juergen