Re: [Qemu-devel] x86 segment limits enforcement with TCG

Stephen Checkoway Thu, 28 Feb 2019 09:28:00 -0800

This is all extremely helpful! I'll dig in and try this approach soon.

> On Feb 28, 2019, at 11:11, Richard Henderson <richard.hender...@linaro.org> wrote:
> 
>> Are you thinking that this should be modeled as independent sets of TLBs, 
>> one per mode?
> 
> One per segment you mean?


Yes.

>  Yes, exactly.  Since each segment can have
> independent segment base + limit + permissions.  All of which would be taken
> into account by tlb_fill when populating the TLB.
> 
>> It seems easier to have a linear address MMU mode and then for the MMU modes
>> corresponding to segment registers, perform an access and limit check,
>> adjust the address by the segment base, and then go through the linear
>> address MMU mode translation.
> 
> Except you need to generate extra calls at runtime to perform this translation,
> and you are not able to cache the result of the lookup against a second access
> to the same page.

I see. That makes sense. I didn't realize the results of the calls were being 
cached.
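
To check my understanding of the caching point, here is a minimal sketch in
Python. It is a toy model, not QEMU code: the names Segment, SegmentTLB,
translate and _fill are hypothetical, paging is ignored (the "linear" address
is just base + offset), and expand-down segments are not handled. The slow
path does the limit and permission check once per page, and later accesses to
a cached page skip it. A page that the limit cuts through is deliberately
never cached, so every access to it is checked.

```python
PAGE_BITS = 12
PAGE_SIZE = 1 << PAGE_BITS
PAGE_MASK = PAGE_SIZE - 1
ADDR_MASK = 0xFFFFFFFF


class SegmentFault(Exception):
    """Raised when an access violates the segment limit or permissions."""


class Segment:
    def __init__(self, base, limit, writable=True):
        self.base = base          # linear base address
        self.limit = limit        # highest valid offset, inclusive
        self.writable = writable


class SegmentTLB:
    """Toy per-segment TLB: offset page number -> (linear page base, writable)."""

    def __init__(self, segment):
        self.segment = segment
        self.entries = {}
        self.fills = 0            # number of slow-path calls, for illustration

    def flush(self):
        # Would be needed whenever the segment register is reloaded.
        self.entries.clear()

    def translate(self, offset, size=1, write=False):
        page = offset >> PAGE_BITS
        in_page = offset & PAGE_MASK
        entry = self.entries.get(page)
        # Fast path: page is cached and the access does not cross the page end.
        if entry is not None and in_page + size <= PAGE_SIZE:
            linear_page, writable = entry
            if writable or not write:
                return (linear_page + in_page) & ADDR_MASK
        return self._fill(offset, size, write)

    def _fill(self, offset, size, write):
        # Slow path: the full limit and permission check happens here.
        self.fills += 1
        seg = self.segment
        if offset + size - 1 > seg.limit:
            raise SegmentFault("offset beyond segment limit")
        if write and not seg.writable:
            raise SegmentFault("write to read-only segment")
        page_start = (offset >> PAGE_BITS) << PAGE_BITS
        # Cache only pages that lie entirely below the limit.
        if page_start + PAGE_SIZE - 1 <= seg.limit:
            self.entries[offset >> PAGE_BITS] = (
                (seg.base + page_start) & ADDR_MASK,
                seg.writable,
            )
        return (seg.base + offset) & ADDR_MASK
```

With a segment of base 0x10000 and limit 0x17ff, two accesses to offsets 0x10
and 0x14 cost one fill in this model, while every access to the page starting
at 0x1000 goes through the slow path because the limit falls inside that page.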

> 
>> In particular, code that uses segments spends a lot of time changing the
>> values of segment registers. E.g., in the movs example above, the ds segment
>> may be overridden but the es segment cannot be, so to use the string move
>> instructions within ds, es needs to be saved, modified, and then restored.
> 
> You are correct that this would result in two TLB flushes.
> 
> But if MOVS executes a non-trivial number of iterations, we still may win.
> 
> The work that Emilio Cota has done in this development cycle to make the size
> of the softmmu TLBs dynamic will help here.  It may well be that MOVS is used
> with small memcpy, and there are a fair few flushes.  But in that case the TLB
> will be kept very small, and so the flush will not be expensive.

I wonder if it would make sense to maintain a small cache of TLBs. The majority 
of cases are likely to involve setting segment registers to one of a handful 
of segments (e.g., setting es to ds or ss). So it might be nice to avoid the 
flushes entirely.
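
Roughly what I have in mind, as a hedged Python sketch rather than anything
resembling the real softmmu structures: TLBCache and load_segment are
hypothetical names, each TLB is modeled as a plain dict, and the cache is keyed
by the descriptor contents (base, limit, writable) so that loading es with the
same descriptor as ds hands back an already populated TLB instead of flushing.

```python
from collections import OrderedDict


class TLBCache:
    """Keep the TLBs of the most recently used segment descriptors."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.tlbs = OrderedDict()   # (base, limit, writable) -> TLB dict
        self.reused = 0             # segment loads that avoided a flush
        self.created = 0            # segment loads that started with an empty TLB

    def load_segment(self, base, limit, writable=True):
        """Return the TLB to use after a segment register load."""
        key = (base, limit, writable)
        tlb = self.tlbs.get(key)
        if tlb is not None:
            # Same descriptor seen recently: reuse its TLB contents.
            self.tlbs.move_to_end(key)
            self.reused += 1
            return tlb
        if len(self.tlbs) >= self.capacity:
            # Evict the least recently used descriptor's TLB.
            self.tlbs.popitem(last=False)
        tlb = {}
        self.tlbs[key] = tlb
        self.created += 1
        return tlb
```

Keying on the descriptor contents rather than the selector means a rewritten
GDT entry simply misses in the cache. Anything that invalidates the underlying
page translations would presumably still have to flush every cached TLB, which
this sketch does not model.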

> On the other hand, DS changes are rare (depending on the programming model),
> and SS changes only on context switches.  Their TLBs will keep their contents,
> even while ES gets flushed.  Work has been saved over adding explicit calls to
> a linear address helper function.

In my case, ds changes are pretty frequent: I count 75 instances of mov ds, __ 
and 124 instances of pop ds in the executive (ring 0) portion of this firmware. 
Obviously the dynamic count is more interesting, but I don't have that off-hand.

> The vast majority of x86 instructions have exactly one memory access, and it
> uses the default segment (ds/ss) or the segment override.  We can set this
> default mmu index as soon as we have seen any segment override.
> 
>> Returning to the movs example, the order of operations _must_ be
>> 1. lea ds:[esi]
>> 2. load 4 bytes
>> 3. lea es:[edi]
>> 4. store 4 bytes
> 
> MOVS is one of the rare examples of two memory accesses within one instruction.
> Yes, we would have to special case this, and be careful to get everything right.

I agree that the vast majority of x86 instructions access at most one segment, 
but off-hand, I can think of a handful that access two:

- movs 
- cmps
- push r/m32
- pop r/m32
- call m32
- call m16:m32

I'm not sure if there are others.
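
For the movs ordering quoted above, here is a small Python sketch of a single
movsd step. It is only an illustration under stated assumptions: check and
movsd are hypothetical helpers, a segment is a (base, limit, writable) tuple,
memory is a flat bytearray, paging is ignored, and the direction flag is
assumed clear. The point is that the source check and load happen before the
destination check, and that a fault on either side leaves esi, edi and memory
untouched.

```python
class SegmentFault(Exception):
    """Raised when an access violates the segment limit or permissions."""


def check(name, seg, offset, size, write):
    """Limit/permission check for one access; returns the linear address."""
    base, limit, writable = seg
    if offset + size - 1 > limit:
        raise SegmentFault(f"{name}: offset {offset:#x} beyond limit {limit:#x}")
    if write and not writable:
        raise SegmentFault(f"{name}: segment is not writable")
    return base + offset


def movsd(mem, regs, segs, src_seg="ds"):
    """One movsd iteration; src_seg models a segment override on the source."""
    src = check(src_seg, segs[src_seg], regs["esi"], 4, write=False)  # 1. lea seg:[esi]
    value = bytes(mem[src:src + 4])                                   # 2. load 4 bytes
    dst = check("es", segs["es"], regs["edi"], 4, write=True)         # 3. lea es:[edi]
    mem[dst:dst + 4] = value                                          # 4. store 4 bytes
    # Registers are only updated once both accesses have succeeded.
    regs["esi"] = (regs["esi"] + 4) & 0xFFFFFFFF
    regs["edi"] = (regs["edi"] + 4) & 0xFFFFFFFF
```

The destination segment is hard-wired to es here, matching the point that only
the source segment can be overridden.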

-- 
Stephen Checkoway
