On 24.06.2025 15:47, Oleksii Kurochko wrote:
> On 6/24/25 12:44 PM, Jan Beulich wrote:
>> On 24.06.2025 11:46, Oleksii Kurochko wrote:
>>> On 6/18/25 5:46 PM, Jan Beulich wrote:
>>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>>> --- /dev/null
>>>>> +++ b/xen/arch/riscv/p2m.c
>>>>> @@ -0,0 +1,115 @@
>>>>> +#include <xen/bitops.h>
>>>>> +#include <xen/lib.h>
>>>>> +#include <xen/sched.h>
>>>>> +#include <xen/spinlock.h>
>>>>> +#include <xen/xvmalloc.h>
>>>>> +
>>>>> +#include <asm/p2m.h>
>>>>> +#include <asm/sbi.h>
>>>>> +
>>>>> +static spinlock_t vmid_alloc_lock = SPIN_LOCK_UNLOCKED;
>>>>> +
>>>>> +/*
>>>>> + * hgatp's VMID field is 7 or 14 bits. RV64 may support 14-bit VMID.
>>>>> + * Using a bitmap here limits us to 127 (2^7 - 1) or 16383 (2^14 - 1)
>>>>> + * concurrent domains.
>>>> Which is pretty limiting especially in the RV32 case. Hence why we don't
>>>> assign a permanent ID to VMs on x86, but rather manage IDs per-CPU (note:
>>>> not per-vCPU).
>>> Good point.
>>>
>>> I don't believe anyone will use RV32.
>>> For RV64, the available ID space seems sufficiently large.
>>>
>>> However, if it turns out that the value isn't large enough even for RV64,
>>> I can rework it to manage IDs per physical CPU.
>>> Wouldn't that approach result in more TLB entries being flushed compared
>>> to per-vCPU allocation, potentially leading to slightly worse performance?
>> Depends on the condition for when to flush. Of course performance is
>> unavoidably going to suffer if you have only very few VMIDs to use.
>> Nevertheless, as indicated before, the model used on x86 may be a
>> candidate to use here, too. See hvm_asid_handle_vmenter() for the
>> core (and vendor-independent) part of it.
> 
> IIUC, so basically it is just round-robin: when the VMIDs run out, do a
> full guest TLB flush and start re-using VMIDs from the beginning.
> It makes sense to me, I'll implement something similar. (I'm not really
> sure that we need data->core_asid_generation; probably I will understand
> it better once I start to implement it.)

Well. The fewer VMID bits you have, the more quickly you will need a new
generation. And to keep track of which generation you're at, you also need to
store the present number somewhere (which is what data->core_asid_generation
does on x86).
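
For illustration only, here is a rough sketch of what such a generation-based,
per-pCPU scheme could look like. Every name in it (vmid_context, vcpu_vmid,
VMID_MAX, ...) is made up for this example; it is loosely modelled on
hvm_asid_handle_vmenter(), not copied from it:

#include <stdbool.h>
#include <stdint.h>

#define VMID_MAX  (1u << 7)       /* e.g. 7-bit hgatp.VMID (14 bits on RV64) */

struct vmid_context {             /* one instance per physical CPU */
    uint64_t generation;          /* starts at 1; bumped on every wrap */
    uint32_t next_vmid;           /* next VMID to hand out, in [1, VMID_MAX) */
};

struct vcpu_vmid {                /* per vCPU (or per domain) */
    uint64_t generation;          /* generation the VMID below was taken in */
    uint32_t vmid;
};

/* Returns true if the caller must flush the local guest TLB (hfence.gvma). */
static bool vmid_handle_vmentry(struct vmid_context *ctx, struct vcpu_vmid *v)
{
    bool need_flush = false;

    /* VMID still belongs to this pCPU's current generation: keep using it. */
    if ( v->generation == ctx->generation )
        return false;

    /* Out of VMIDs: start a new generation and recycle from 1. */
    if ( ctx->next_vmid >= VMID_MAX )
    {
        ctx->generation++;
        ctx->next_vmid = 1;       /* keep VMID 0 reserved */
        need_flush = true;        /* recycled VMIDs may have stale entries */
    }

    v->generation = ctx->generation;
    v->vmid = ctx->next_vmid++;

    return need_flush;            /* caller reprograms hgatp.VMID either way */
}

The generation counter is what lets a vCPU that last ran before a wrap-around
notice, on its next VM entry, that its old VMID may have been recycled and
must not be reused as-is.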

>>> What about then to allocate VMID per-domain?
>> That's what you're doing right now, isn't it? And that gets problematic when
>> you have only very few bits in hgatp.VMID, as mentioned below.
> 
> Right, I just phrased my question poorly—sorry about that.
> 
> What I meant to ask is: does the approach described above actually depend on
> whether VMIDs are allocated per-domain or per-pCPU? It seems that the main
> advantage of allocating VMIDs per-pCPU is potentially reducing the number of
> TLB flushes, since a platform is more likely to have more than VMID_MAX
> domains than to have more than VMID_MAX physical CPUs - am I right?

Seeing that there can be systems with hundreds or even thousands of CPUs,
I don't think I can agree here. Plus per-pCPU allocation would similarly
get you in trouble when you have only very few VMID bits.

>>>>> +        sbi_remote_hfence_gvma_vmid(d->dirty_cpumask, 0, 0, p2m->vmid);
>>>> You're creating d; it cannot possibly have run on any CPU yet. IOW
>>>> d->dirty_cpumask will be reliably empty here. I think it would be hard to
>>>> avoid issuing the flush to all CPUs here in this scheme.
>>> I didn't double-check, but I was sure that in case d->dirty_cpumask is
>>> empty, an rfence would be sent to all CPUs. But I was wrong about that.
>>>
>>> What about just updating the code of sbi_rfence_v02()?
>> I don't know, but dealing with the issue there feels wrong. However,
>> before deciding where to do something, it needs to be clear what you
>> actually want to achieve. To me at least, that's not clear at all.
> 
> I want to achieve the following behavior: if a mask is empty
> (specifically, in our case d->dirty_cpumask), then perform the flush
> on all CPUs.

That's still too far into the "how". The "why" here is still unclear: Why
do you need any flushing here at all? (With the scheme you now mean to
implement I expect it'll become yet more clear that no flush is needed
during domain construction.)
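
To make that concrete (building on the hypothetical sketch earlier in this
mail, with equally made-up names): with such a scheme domain construction
only marks the VMID state as stale, and any flush happens lazily on the next
VM entry, on whichever pCPU the vCPU actually runs:

/* Hypothetical: at domain/p2m construction time, no VMID and no flush. */
static void vmid_init(struct vcpu_vmid *v)
{
    v->generation = 0;            /* 0 == stale: forces allocation at entry */
    v->vmid = 0;
}

void local_hfence_gvma_all(void); /* hypothetical local guest-TLB flush */

/* Hypothetical VM-entry path. */
static void vmentry_assign_vmid(struct vmid_context *ctx, struct vcpu_vmid *v)
{
    if ( vmid_handle_vmentry(ctx, v) )
        local_hfence_gvma_all();
    /* ... write v->vmid into hgatp.VMID and enter the guest ... */
}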

Jan
