On Thu, May 28, 2020 at 07:54:35PM +0100, Julien Grall wrote:
> Hi Bertrand,
> 
> Thank you for the patch.
> 
> On 28/05/2020 16:25, Bertrand Marquis wrote:
> > +static int map_runstate_area(struct vcpu *v,
> > +                    struct vcpu_register_runstate_memory_area *area)
> > +{
> > +    unsigned long offset = area->addr.p & ~PAGE_MASK;
> > +    void *mapping;
> > +    struct page_info *page;
> > +    size_t size = sizeof(struct vcpu_runstate_info);
> > +
> > +    ASSERT(runstate_guest(v) == NULL);
> > +
> > +    /* do not allow an area crossing 2 pages */
> > +    if ( offset > (PAGE_SIZE - size) )
> > +        return -EINVAL;
> 
> This is a change in behavior for the guest. If we are going forward with
> this, this will want a separate patch with its own explanation why this is
> done.

I don't think we can go this route without supporting crossing a page
boundary.

Linux will BUG if VCPUOP_register_runstate_memory_area fails, and
AFAICT there's no check in Linux to assure the runstate area doesn't
cross a page boundary. If we want to go this route we must support the
area crossing a page boundary, or else we will break existing
guests.

> > +
> > +#ifdef CONFIG_ARM
> > +    page = get_page_from_gva(v, area->addr.p, GV2M_WRITE);
> 
> A guest is allowed to setup the runstate for a different vCPU than the
> current one. This will lead to get_page_from_gva() to fail as the function
> cannot yet work with a vCPU other than current.
> 
> AFAICT, there is no restriction on when the runstate hypercall can be
> called. So this can even be called before the vCPU is brought up.
> 
> I was going to suggest to use the current vCPU for translating the address.
> However, it would be reasonable for an OS to use the same virtual address
> for all the vCPUs assuming the page-tables are different per vCPU.

Hm, it's a tricky question. Using the current vCPU page tables would
seem like a good compromise, but it needs to get added to the header
as a note, and this should ideally be merged at the start of a
development cycle to get people time to test and report issues.

> Recent Linux are using a per-cpu area, so the virtual address should be
> different for each vCPU. But I don't know how the other OSes works. Roger
> should be able to help for FreeBSD at least.

FreeBSD doesn't use VCPUOP_register_runstate_memory_area at all, so we
are safe in that regard.

I never got around to implementing the required scheduler changes in
order to support stolen time accounting.  Note sure this has changed
since I last checked, the bhyve and KVM guys also had interest in
properly accounting for stolen time on FreeBSD IIRC.

Roger.

Reply via email to