Re: [PATCH] KVM: MMU: speedup update_permission_bitmask

2017-09-12 Thread Peter Feiner
On Tue, Sep 12, 2017 at 12:55 PM, Paolo Bonzini wrote:
> On 12/09/2017 18:48, Peter Feiner wrote:
>>>> Because update_permission_bitmask is actually the top item in the profile
>>>> for nested vmexits, this speeds up an L2->L1 vmexit by about ten thousand
>>>> clock cycles, or up to 30%:
>> >

Re: [PATCH] KVM: MMU: speedup update_permission_bitmask

2017-09-12 Thread Paolo Bonzini
On 12/09/2017 18:48, Peter Feiner wrote:
>>>
>>> Because update_permission_bitmask is actually the top item in the profile
>>> for nested vmexits, this speeds up an L2->L1 vmexit by about ten thousand
>>> clock cycles, or up to 30%:
>
> This is a great improvement! Why not take it a step further an

Re: [PATCH] KVM: MMU: speedup update_permission_bitmask

2017-09-12 Thread Peter Feiner
On Mon, Aug 28, 2017 at 12:42 PM, Jim Mattson wrote:
> Looks okay to me, but I'm hoping Peter will chime in.
Sorry, this slipped by. Busy couple of weeks!
> Reviewed-by: Jim Mattson
>
> On Thu, Aug 24, 2017 at 8:56 AM, Paolo Bonzini wrote:
> > update_permission_bitmask currently does a 1

Re: [PATCH] KVM: MMU: speedup update_permission_bitmask

2017-09-12 Thread Paolo Bonzini
On 29/08/2017 18:46, David Hildenbrand wrote:
>> +#define BYTE_MASK(access) \
>> +((1 & (access) ? 2 : 0) | \
>> + (2 & (access) ? 4 : 0) | \
>> + (3 & (access) ? 8 : 0) | \
>> + (4 & (access) ? 16 : 0) | \
>> + (5 & (access) ? 32 : 0) | \
>> + (6 & (access) ? 64 : 0) | \
>>
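For readers following the quoted macro: each visible term sets bit i of the result when index i shares a bit with the access argument. The sketch below is a loop-based restatement of that pattern, not the patch's code. The function name byte_mask and the access-bit values 1, 2 and 4 are assumptions made for illustration, and the loop bound of 8 assumes one result bit per 3-bit index.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): bit i of the result is set
 * when index i has any bit in common with `access`.
 */
uint8_t byte_mask(unsigned access)
{
    uint8_t mask = 0;

    for (unsigned i = 0; i < 8; i++)
        if (i & access)
            mask |= (uint8_t)(1u << i);
    return mask;
}

/* Self-check for the three single-bit access values assumed above. */
void byte_mask_selfcheck(void)
{
    assert(byte_mask(1) == 0xAA); /* indexes 1, 3, 5, 7 */
    assert(byte_mask(2) == 0xCC); /* indexes 2, 3, 6, 7 */
    assert(byte_mask(4) == 0xF0); /* indexes 4, 5, 6, 7 */
}
```

Under these assumptions, each single access bit maps to a fixed byte, so a test against that bit for all eight index values can be replaced by one bitwise operation on the byte.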

Re: [PATCH] KVM: MMU: speedup update_permission_bitmask

2017-08-29 Thread David Hildenbrand
On 24.08.2017 17:56, Paolo Bonzini wrote:
> update_permission_bitmask currently does a 128-iteration loop to,
> essentially, compute a constant array. Computing the 8 bits in parallel
> reduces it to 16 iterations, and is enough to speed it up substantially
> because many boolean operations in the

Re: [PATCH] KVM: MMU: speedup update_permission_bitmask

2017-08-28 Thread Jim Mattson
Looks okay to me, but I'm hoping Peter will chime in.
Reviewed-by: Jim Mattson
On Thu, Aug 24, 2017 at 8:56 AM, Paolo Bonzini wrote:
> update_permission_bitmask currently does a 128-iteration loop to,
> essentially, compute a constant array. Computing the 8 bits in parallel
> reduces it to 16

[PATCH] KVM: MMU: speedup update_permission_bitmask

2017-08-24 Thread Paolo Bonzini
update_permission_bitmask currently does a 128-iteration loop to, essentially, compute a constant array. Computing the 8 bits in parallel reduces it to 16 iterations, and is enough to speed it up substantially because many boolean operations in the inner loop become constants or simplify noticeabl
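The 128-to-16 reduction described above can be sketched as follows. This is a simplified illustration, not the kernel function: the fault rule, the bit layouts (ACC_*, F_*) and both function names are assumptions chosen to keep the example small. The first function decides one table bit per iteration (16 x 8 = 128 iterations); the second computes all 8 bits of a table byte with bitwise operations (16 iterations).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layouts for this sketch only. */
enum { ACC_X = 1, ACC_W = 2, ACC_U = 4 };      /* page access bits */
enum { F_WRITE = 1, F_USER = 2, F_FETCH = 4 }; /* fault-code bits */
#define NCODES 16

/* One table bit per iteration: 16 * 8 = 128 iterations. */
void fill_bit_by_bit(uint8_t table[NCODES])
{
    for (unsigned code = 0; code < NCODES; code++) {
        uint8_t byte = 0;

        for (unsigned acc = 0; acc < 8; acc++) {
            /* Simplified rule: fault if a requested right is missing. */
            int fault = ((code & F_FETCH) && !(acc & ACC_X)) ||
                        ((code & F_WRITE) && !(acc & ACC_W)) ||
                        ((code & F_USER) && !(acc & ACC_U));
            if (fault)
                byte |= (uint8_t)(1u << acc);
        }
        table[code] = byte;
    }
}

/* All 8 bits of a byte at once: 16 iterations. */
void fill_byte_at_a_time(uint8_t table[NCODES])
{
    /* Bit i is set when access combination i includes X, W or U. */
    const uint8_t x = 0xAA, w = 0xCC, u = 0xF0;

    for (unsigned code = 0; code < NCODES; code++) {
        uint8_t xf = (code & F_FETCH) ? (uint8_t)~x : 0;
        uint8_t wf = (code & F_WRITE) ? (uint8_t)~w : 0;
        uint8_t uf = (code & F_USER) ? (uint8_t)~u : 0;

        table[code] = (uint8_t)(xf | wf | uf);
    }
}

/* Both fillers must agree on every entry. */
void check_tables_match(void)
{
    uint8_t slow[NCODES], fast[NCODES];

    fill_bit_by_bit(slow);
    fill_byte_at_a_time(fast);
    for (unsigned i = 0; i < NCODES; i++)
        assert(slow[i] == fast[i]);
}
```

In the byte-wide version each condition on the fault code is evaluated once per iteration instead of once per bit, which is the kind of simplification the patch description refers to.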