On Wed, Aug 12, 2015 at 2:36 PM, Paolo Bonzini <pbonz...@redhat.com> wrote:
>
> On 12/08/2015 09:31, alvise rigo wrote:
>> I think that tlb_flush_entry is not enough, since in theory another
>> vCPU could have a different TLB address referring to the same phys
>> address.
>
> You're right, this is a TLB so it's virtually-indexed. :( I'm not sure
> what happens on ARM, since it has a virtually indexed (VIVT or VIPT)
> cache, but indeed it would be a problem when implementing e.g. CMPXCHG
> using the TCG ll/sc ops.
>
> I'm a bit worried about adding such a big bitmap. It's only used on
> TCG, but it is also allocated on KVM, and on KVM you can have hundreds
> of VCPUs. Wasting 200 bits per guest memory page (i.e. ~0.6% of guest
> memory) is obviously not a great idea. :(
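(For scale, assuming 4 KiB guest pages: 200 bits out of 4096 * 8 = 32768
bits per page is indeed roughly 0.6% of guest memory.)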
I agree, it's a waste of memory.

> Perhaps we can use a bytemap instead:
>
> - 0..253 = TLB_EXCL must be set in all VCPUs except CPU n.  A VCPU that
>   loads the TLB for this vaddr does not have to set it.
>
> - 254 = TLB_EXCL must be set in all VCPUs.  A VCPU that loads the TLB
>   for this vaddr has to set it.
>
> - 255 = TLB_EXCL not set in at least two VCPUs
>
> Transitions:
>
> - ll transitions: anything -> 254
>
> - sc transitions: 254 -> current CPU_ID
>
> - TLB_EXCL store transitions: 254 -> current CPU_ID
>
> - tlb_set_page transitions: CPU_ID other than current -> 255
>
> The initial value is 255 on SMP guests, 0 on UP guests.
>
> The algorithms are very similar to yours, just using this approximate
> representation.
>
> ll algorithm:
>       llsc_value = bytemap[vaddr]
>       if llsc_value == CPU_ID
>             do nothing
>       elseif llsc_value < 254
>             flush TLB of CPU llsc_value
>       elseif llsc_value == 255
>             flush all TLBs
>       set TLB_EXCL
>       bytemap[vaddr] = 254
>       load
>
> tlb_set_page algorithm:
>       llsc_value = bytemap[vaddr]
>       if llsc_value == CPU_ID or llsc_value == 255
>             do nothing
>       else if llsc_value == 254
>             set TLB_EXCL
>       else
>             # two CPUs without TLB_EXCL
>             bytemap[vaddr] = 255
>
> TLB_EXCL slow path algorithm:
>       if bytemap[vaddr] == 254
>             bytemap[vaddr] = CPU_ID
>       else
>             # two CPUs without TLB_EXCL
>             bytemap[vaddr] = 255
>       clear TLB_EXCL in this CPU
>       store
>
> sc algorithm:
>       if bytemap[vaddr] == CPU_ID or bytemap[vaddr] == 254
>             bytemap[vaddr] = CPU_ID
>             clear TLB_EXCL in this CPU
>             store
>             succeed
>       else
>             fail
>
> clear algorithm:
>       if bytemap[vaddr] == 254
>             bytemap[vaddr] = CPU_ID

Isn't something like this also required for the clear algorithm?

      if bytemap[vaddr] < 254
            # this can happen as a side effect of the TLB_EXCL slow path
            bytemap[vaddr] = 255

The whole idea makes sense; I will consider it for the next iteration
of the patches.

Thanks,
alvise

> The UP case is optimized because bytemap[vaddr] will always be 0 or 254.
>
> The algorithm requires the LL to be cleared e.g. on exceptions.
>
> Paolo
>
>> alvise
>>
>> On Tue, Aug 11, 2015 at 6:32 PM, Paolo Bonzini <pbonz...@redhat.com> wrote:
>>>
>>>
>>> On 11/08/2015 18:11, alvise rigo wrote:
>>>>>> Why flush the entire cache (I understand you mean TLB)?
>>>> Sorry, I meant the TLB.
>>>> If for each removal of an exclusive entry we also set the bit to 1, we
>>>> force the following LL to make a tlb_flush() on every vCPU.
>>>
>>> What if you only flush one entry with tlb_flush_entry (on every vCPU)?
>>>
>>> Paolo
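P.S.: to make the bytemap transitions above concrete, here is a rough,
untested C sketch of the state machine.  All names (bytemap,
flush_tlb_of_cpu, set_tlb_excl, ...) are hypothetical helpers for
illustration only, not existing QEMU interfaces; the sketch assumes one
bytemap entry per guest page, indexed by page number.

    /* Sketch of the bytemap state machine; helper names are made up.
     * One byte per guest page:
     *   0..253 -> TLB_EXCL set everywhere except on CPU <value>
     *   254    -> TLB_EXCL set on all vCPUs
     *   255    -> TLB_EXCL not set on at least two vCPUs
     */
    #include <stdint.h>
    #include <stdbool.h>

    #define EXCL_ALL   254   /* TLB_EXCL set in all vCPUs */
    #define EXCL_NONE  255   /* TLB_EXCL missing in at least two vCPUs */

    extern uint8_t bytemap[];                /* one entry per guest page */
    extern void flush_tlb_of_cpu(int cpu);   /* hypothetical helpers */
    extern void flush_all_tlbs(void);
    extern void set_tlb_excl(int cpu, uint64_t page);
    extern void clear_tlb_excl(int cpu, uint64_t page);

    /* ll: make every other vCPU's next store take the slow path */
    static void ll(int cpu, uint64_t page)
    {
        uint8_t v = bytemap[page];

        if (v != cpu) {
            if (v < EXCL_ALL) {
                flush_tlb_of_cpu(v);     /* only CPU v lacks TLB_EXCL */
            } else if (v == EXCL_NONE) {
                flush_all_tlbs();        /* at least two CPUs lack it */
            }
        }
        set_tlb_excl(cpu, page);
        bytemap[page] = EXCL_ALL;
        /* ... perform the load ... */
    }

    /* sc: succeeds only if no other vCPU stored through the slow path */
    static bool sc(int cpu, uint64_t page)
    {
        uint8_t v = bytemap[page];

        if (v == cpu || v == EXCL_ALL) {
            bytemap[page] = cpu;
            clear_tlb_excl(cpu, page);
            /* ... perform the store ... */
            return true;
        }
        return false;
    }

    /* tlb_set_page: a vCPU loads a TLB entry for this page */
    static void tlb_set_page_hook(int cpu, uint64_t page)
    {
        uint8_t v = bytemap[page];

        if (v == cpu || v == EXCL_NONE) {
            /* nothing to do */
        } else if (v == EXCL_ALL) {
            set_tlb_excl(cpu, page);
        } else {
            bytemap[page] = EXCL_NONE;   /* two CPUs now lack TLB_EXCL */
        }
    }

    /* TLB_EXCL slow path: an ordinary store hits an exclusive page */
    static void tlb_excl_store(int cpu, uint64_t page)
    {
        if (bytemap[page] == EXCL_ALL) {
            bytemap[page] = cpu;
        } else {
            bytemap[page] = EXCL_NONE;   /* two CPUs now lack TLB_EXCL */
        }
        clear_tlb_excl(cpu, page);
        /* ... perform the store ... */
    }

    /* clear (e.g. on exception): drop a pending LL */
    static void clear_excl(int cpu, uint64_t page)
    {
        if (bytemap[page] == EXCL_ALL) {
            bytemap[page] = cpu;
        } else if (bytemap[page] < EXCL_ALL) {
            /* the extra branch suggested above: the TLB_EXCL slow
             * path may already have recorded another CPU_ID here */
            bytemap[page] = EXCL_NONE;
        }
    }

clear_excl() folds in the extra "< 254" branch discussed above, so the
sketch is the proposal plus that one amendment.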