On Sat, Jun 28, 2025 at 06:36:02PM +0100, Julien Grall wrote:
> Hi,
> 
> On 28/06/2025 14:58, Koichiro Den wrote:
> > On Mon, Jun 23, 2025 at 09:41:47AM +0100, Julien Grall wrote:
> > ---(snip)---
> > > > @@ -707,6 +723,7 @@ int arch_domain_create(struct domain *d,
> > > >                           unsigned int flags)
> > > >    {
> > > >        unsigned int count = 0;
> > > > +    int order;
> > > >        int rc;
> > > >        BUILD_BUG_ON(GUEST_MAX_VCPUS < MAX_VIRT_CPUS);
> > > > @@ -791,6 +808,19 @@ int arch_domain_create(struct domain *d,
> > > >        d->arch.sve_vl = config->arch.sve_vl;
> > > >    #endif
> > > > +    /*
> > > > +     * Preallocate the stolen time shared memory regions for all the
> > > > +     * possible vCPUs.
> > > > +     */
> > > > +    order = get_order_from_bytes(d->max_vcpus * sizeof(struct 
> > > > pv_time_region));
> > > 
> > > As this is an order, we could end up to waste memory fairly quickly. So we
> > > should try to free the unused pages from the order. That said, the maximum
> > > number of virtual CPUs we currently support is 128. If I am not mistaken,
> > > this could fit in 2 4KB pages. So I would also be ok with a
> > > BUILD_BUG_ON(MAX_VIRT_CPUS <= 128) and we defer this work.
> > 
> > I'll go with the former in the next iteration. Thanks!
> > 
> > > 
> > > > +    d->arch.pv_time_regions_gfn = INVALID_GFN;
> > > 
> > > Does this mean PV time is optional? If so, shouldn't we allocate the 
> > > memory
> > > conditionally?
> > > 
> > > Also, looking at the code below, you seem to cater domains created via
> > > dom0less but not from the toolstack. I think both should be supported for
> > > the PV time.
> > 
> > Yes, that was intentional. I should've mentioned that this RFC series only
> > caters the dom0less case. For domains created dynamically via xl create, the
> > only viable solution I've found so far is to reserve PFN range(s) large 
> > enough
> > to cover the maximum possible number of toolstack-created domains on boot,
> > which I think would be too wasteful.
> 
> AFAICT, with current MAX_VIRT_CPUS (128), we would only need to reserve 8KB
> in the guest address space. We still have plenty of space that we can afford
> reserve 8KB (the layout is described in xen/include/public/arch-arm.h). I
> would suggest to allocate the region just before the grant-table (see
> GUEST_GNTTAB_BASE).

That makes sense. So I'm now thinking of adding XENMAPSPACE_pv_time and a
new call site of xc_domain_add_to_physmap() to kick MFN allocations + P2M
setup for PV time shared regions of dynamically created domains.
Thank you for the review.

Koichiro Den

> 
> > In any case, I agree that conditional allocation would be preferable.
> 
> For XL, I would suggest to introduce a field flags in xen_arch_domainconfig
> and use one bit for enabling the PV time interface. The other bits would be
> reserved for the future (so we would need to check they are zeroes). You can
> have a look how "flags" in xen_domctl_createdomain is handled.
> 
> [...]
> 
> > > > diff --git a/xen/arch/arm/include/asm/domain.h 
> > > > b/xen/arch/arm/include/asm/domain.h
> > > > index a3487ca71303..c231c45fe40f 100644
> > > > --- a/xen/arch/arm/include/asm/domain.h
> > > > +++ b/xen/arch/arm/include/asm/domain.h
> > > > @@ -59,6 +59,18 @@ struct paging_domain {
> > > >        unsigned long p2m_total_pages;
> > > >    };
> > > > +/* Stolen time shared memory region (ARM DEN 0057A.b) */
> > > > +struct pv_time_region {
> > > > +    /* This field must be 0 as per ARM DEN 0057A.b */
> > > > +    uint32_t revision;
> > > > +
> > > > +    /* This field must be 0 as per ARM DEN 0057A.b */
> > > > +    uint32_t attribute;
> > > > +
> > > > +    /* Total stolen time in nanoseconds */
> > > > +    uint64_t stolen_time;
> > > > +} __aligned(64);
> > > > +
> > > >    struct arch_domain
> > > >    {
> > > >    #ifdef CONFIG_ARM_64
> > > > @@ -121,6 +133,9 @@ struct arch_domain
> > > >        void *tee;
> > > >    #endif
> > > > +    struct pv_time_region *pv_time_regions;
> > > > +    gfn_t pv_time_regions_gfn;
> > > 
> > > Given the feature is 32-bit specific, shouldn't the field be protected 
> > > with
> > > #define CONFIG_ARM_32?
> > 
> > Is this typo s/32/64/? Assuming so, I'll do so (=protect them with #ifdef
> > CONFIG_ARM_64) in the next iteration. Thanks!
> 
> Yes this is a typo.
> 
> Cheers,
> 
> -- 
> Julien Grall
> 

Reply via email to