> On Jul 17, 2018, at 5:29 PM, Andy Lutomirski <l...@kernel.org> wrote:
> 
> On Tue, Jul 17, 2018 at 1:16 PM, Rik van Riel <r...@surriel.com> wrote:
>> Can I skip both the cr4 and ldt switches when the TLB contents
>> are no longer valid and got reloaded?
>> 
>> If the TLB contents are still valid, either because we never went
>> into lazy TLB mode, or because no invalidates happened while
>> we were lazy, we immediately return.
>> 
>> The cr4 and ldt reloads only happen if the TLB was invalidated
>> while we were in lazy TLB mode.
> 
> Yes, since the only events that would change the LDT or the required
> CR4 value will unconditionally broadcast to every CPU in mm_cpumask
> regardless of whether they're lazy.  The interesting case is that you
> go lazy, you miss an invalidation IPI because you were lazy, then you
> go unlazy, notice the tlb_gen change, and flush.  If this happens, you
> know that you only missed a page table update and not an LDT update or
> a CR4 update, because the latter would have sent the IPI even though
> you were lazy.  So you should skip the CR4 and LDT updates.
> 
> I suppose a different approach would be to fix the issue below and to
> try to track when the LDT actually needs reloading.  But that latter
> part seems a bit complicated for minimal gain.
> 
> (Do you believe me?  If not, please argue back!)
> 
I believe you :)
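
To make the argument above concrete, here is a rough user-space model
of the unlazy path. It is only a sketch, not kernel code: model_mm,
model_cpu, model_invalidate() and model_unlazy() are names made up for
illustration, and the counters stand in for the real TLB flush and the
CR4/LDT reloads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; not the real structures. */
struct model_mm {
	uint64_t tlb_gen;		/* bumped on every page table invalidation */
};

struct model_cpu {
	struct model_mm *loaded_mm;
	uint64_t seen_tlb_gen;		/* generation this CPU's TLB is current with */
	bool is_lazy;
	unsigned int tlb_flushes;	/* models a local TLB flush */
	unsigned int cr4_ldt_reloads;	/* models CR4 + LDT reloads */
};

/*
 * Page table change: a lazy CPU gets no IPI and only notices the new
 * tlb_gen later. LDT/CR4 changes are not modelled here because, per the
 * discussion, they always IPI every CPU in mm_cpumask, lazy or not.
 */
static void model_invalidate(struct model_mm *mm, struct model_cpu *cpu)
{
	mm->tlb_gen++;
	if (!cpu->is_lazy && cpu->loaded_mm == mm) {
		cpu->seen_tlb_gen = mm->tlb_gen;
		cpu->tlb_flushes++;
	}
}

/* Leave lazy TLB mode and go back to the mm that is already loaded. */
static void model_unlazy(struct model_cpu *cpu, struct model_mm *next)
{
	assert(cpu->loaded_mm == next);
	cpu->is_lazy = false;

	/* Nothing was missed while lazy: return immediately. */
	if (cpu->seen_tlb_gen == next->tlb_gen)
		return;

	/* Only a page table update can have been missed: flush the TLB ... */
	cpu->seen_tlb_gen = next->tlb_gen;
	cpu->tlb_flushes++;
	/* ... and deliberately leave cr4_ldt_reloads alone. */
}
```

In this model, a CPU that misses one invalidation while lazy ends up
with exactly one TLB flush and zero CR4/LDT reloads after
model_unlazy().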

>>> Hmm.  load_mm_cr4() should bypass itself when mm == &init_mm.  Want to
>>> fix that part or should I?
>> 
>> I would be happy to send in a patch for this, and one for
>> the above optimization you pointed out.
>> 
> 
> Yes please!
> 
There is a third optimization left to do. Currently every time
we switch into lazy tlb mode, we take a refcount on the mm,
even when switching from one kernel thread to another, or
when repeatedly switching between the same mm and kernel
threads.

We could keep that refcount (on a per cpu basis) from the time
we first switch to that mm in lazy tlb mode, to when we switch
the CPU to a different mm.

That would allow us to not bounce the cache line with the
mm_struct reference count on every lazy TLB context switch.
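
Roughly, the idea could be modelled like this in user space. This is a
sketch under assumptions, not a patch: model_mm, model_cpu,
model_enter_lazy() and model_switch_mm() are hypothetical names, a
plain int stands in for the atomic mm reference count, and
refcount_writes counts how often the shared cache line would be
written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct; not the real structure. */
struct model_mm {
	int refcount;			/* models the mm reference count */
	unsigned int refcount_writes;	/* writes to the shared cache line */
};

struct model_cpu {
	struct model_mm *lazy_mm;	/* mm this CPU holds a lazy reference on, or NULL */
};

static void model_grab(struct model_mm *mm)
{
	mm->refcount++;
	mm->refcount_writes++;
}

static void model_drop(struct model_mm *mm)
{
	assert(mm->refcount > 0);
	mm->refcount--;
	mm->refcount_writes++;
}

/* Enter lazy TLB mode on mm; take a reference only if none is held yet. */
static void model_enter_lazy(struct model_cpu *cpu, struct model_mm *mm)
{
	if (cpu->lazy_mm == mm)
		return;			/* reference already cached on this CPU */
	if (cpu->lazy_mm)
		model_drop(cpu->lazy_mm);
	model_grab(mm);
	cpu->lazy_mm = mm;
}

/* Switch the CPU to next; drop the cached reference only for a different mm. */
static void model_switch_mm(struct model_cpu *cpu, struct model_mm *next)
{
	if (cpu->lazy_mm && cpu->lazy_mm != next) {
		model_drop(cpu->lazy_mm);
		cpu->lazy_mm = NULL;
	}
}
```

In this model, bouncing between one mm and kernel threads any number of
times writes the reference count once, and switching to a different mm
writes it once more to drop the cached reference.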

Does that seem like a reasonable optimization?

Am I overlooking anything?

I'll try to get all three optimizations working, and will run them
through some testing here before posting upstream.
