On 14/11/2025 4:32 pm, Teddy Astie wrote: > A 4K page appears to be able to hold 128 ioreq entries, which luckly > matches the current vCPU limit. However, if we decide to increase the > vCPU limit, that doesn't hold anymore and this function would now > silently fetch a out of bounds pointer. > > All architectures have no more than 128 as vCPU limit on HVM guests, > and have pages that are at most 4 KB, so this case doesn't occurs in > with the current limits. > > Make sure that out of bounds attempts are reported and adjust the around > logic to at worst crash the offending domain instead. > > No functional change. > > Reported-by: Julian Vetter <[email protected]> > Signed-off-by: Teddy Astie <[email protected]> > --- > v2: > - check and report instead of ASSERT and eventually crash offending domain > > xen/common/ioreq.c | 27 ++++++++++++++++++++++++--- > 1 file changed, 24 insertions(+), 3 deletions(-) > > diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c > index f5fd30ce12..a2a2dafe85 100644 > --- a/xen/common/ioreq.c > +++ b/xen/common/ioreq.c > @@ -100,7 +100,14 @@ static ioreq_t *get_ioreq(struct ioreq_server *s, struct > vcpu *v) > ASSERT((v == current) || !vcpu_runnable(v)); > ASSERT(p != NULL); > > - return &p->vcpu_ioreq[v->vcpu_id]; > + if ( likely(v->vcpu_id < (PAGE_SIZE / sizeof(struct ioreq))) ) > + return &p->vcpu_ioreq[v->vcpu_id]; > + else > + { > + gprintk(XENLOG_ERR, "Out of bounds vCPU %pv in ioreq server\n", v); > + WARN(); > + return NULL; > + } > }
While I appreciate this is fixing a latent bug, it's very invasive *and* needs removing as part of changing the 128 limit. (i.e. its a lot of churn for no benefit in the meantime.) Xen cannot release in a position where the user can request 130 CPUs but explodes like this on the top two. Furthermore, the WARN() isn't ratelimited, yet this is guest-triggerable. What we need to do is borrow WARN_ON_ONCE() from Linux for cases like this, where it's obvious in the logs but can't be triggered repeatedly. ~Andrew
