On 8/14/25 12:03, Roger Pau Monne wrote:
> The logic in map_range() will bubble up failures to the upper layer, which
> will result in any remaining regions being skip, and for the non-hardware
> domain case the owner domain of the device would be destroyed. However for
> the hardware domain the intent is to continue execution, hopping the
> failure to modify the p2m could be worked around by the hardware domain.
>
> To accomplish that in a better way, ignore failures and skip the range in
> that case, possibly continuing to map further ranges.
>
> Since the error path in vpci_process_pending() should only be used by domUs
> now, and it will unconditionally end up calling domain_crash(), simplify
> it: there's no need to cleanup if the domain will be destroyed.
>
> No functional change for domUs intended.
>
> Signed-off-by: Roger Pau Monné <roger....@citrix.com>
> ---
>  xen/drivers/vpci/header.c | 51 +++++++++++++++++++++------------------
>  1 file changed, 28 insertions(+), 23 deletions(-)
>
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index b9756b364300..1035dcca242d 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -64,7 +64,8 @@ static int cf_check map_range(
>              printk(XENLOG_G_WARNING
>                     "%pd denied access to MMIO range [%#lx, %#lx]\n",
>                     map->d, map_mfn, m_end);
> -            return -EPERM;
> +            rc = -EPERM;
> +            goto done;
>          }
>
>          rc = xsm_iomem_mapping(XSM_HOOK, map->d, map_mfn, m_end, map->map);
> @@ -73,7 +74,7 @@ static int cf_check map_range(
>              printk(XENLOG_G_WARNING
>                     "%pd XSM denied access to MMIO range [%#lx, %#lx]: %d\n",
>                     map->d, map_mfn, m_end, rc);
> -            return rc;
> +            goto done;
>          }
>
>          /*
> @@ -87,17 +88,27 @@ static int cf_check map_range(
>
>          rc = map->map ? map_mmio_regions(map->d, _gfn(s), size, _mfn(map_mfn))
>                        : unmap_mmio_regions(map->d, _gfn(s), size,
>                                             _mfn(map_mfn));
> -        if ( rc == 0 )
> -        {
> -            *c += size;
> -            break;
> -        }
>          if ( rc < 0 )
>          {
>              printk(XENLOG_G_WARNING
>                     "Failed to %smap [%lx %lx] -> [%lx %lx] for %pd: %d\n",
>                     map->map ? "" : "un", s, e, map_mfn,
>                     map_mfn + size, map->d, rc);
> +            goto done;
> +        }
> +        if ( rc == 0 )
> +        {
> +        done:
> +            if ( is_hardware_domain(map->d) )
> +            {
> +                /*
> +                 * Ignore failures for the hardware domain and skip the range.
> +                 * Do it as a best effort workaround to attempt to get the
> +                 * hardware domain to boot.
> +                 */
> +                rc = 0;

If we return success without having incremented *c (i.e. with a consumed
size of zero), we will potentially attempt to map/unmap the same region
again in an infinite loop: rangeset_consume_ranges() would invoke
map_range() again directly, without returning to vpci_process_pending().

> +                *c += size;

This line is now only executed for hwdom, but ...

> +            }

... it should go outside of the hwdom check because domUs still need it
(see the sketch at the end of this reply).

>              break;
>          }
>          ASSERT(rc < size);
> @@ -213,28 +224,22 @@ bool vpci_process_pending(struct vcpu *v)
>              return true;
>          }
>
> -        if ( rc )
> +        if ( rc && !is_hardware_domain(v->domain) )
>          {
> -            spin_lock(&pdev->vpci->lock);
> -            /* Disable memory decoding unconditionally on failure. */
> -            modify_decoding(pdev, v->vpci.cmd & ~PCI_COMMAND_MEMORY,
> -                            false);

This path could be taken for either a map or an unmap. Do we still want to
disable memory decoding in the unmap case?

> -            spin_unlock(&pdev->vpci->lock);
> -
> -            /* Clean all the rangesets */
> -            for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> -                if ( !rangeset_is_empty(header->bars[i].mem) )
> -                    rangeset_purge(header->bars[i].mem);
> -
> -            v->vpci.pdev = NULL;
> -
>              read_unlock(&v->domain->pci_lock);
>
> -            if ( !is_hardware_domain(v->domain) )
> -                domain_crash(v->domain);
> +            domain_crash(v->domain);
>
>              return false;
>          }
> +        ASSERT(!rc);
> +        /*
> +         * Purge rangeset to deal with the hardware domain having triggered an
> +         * error.  It shouldn't be possible, as map_range() will always shallow
> +         * errors for hardware domain owned devices, and
> +         * rangeset_consume_ranges() itself doesn't generate any errors.
> +         */
> +        rangeset_purge(bar->mem);

Reiterating what was mentioned above: if map_range() returned 0 without
incrementing *c, we won't make it back here.

>      }
>      v->vpci.pdev = NULL;
>
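To illustrate the two points about *c, here is a rough and untested sketch
(just to show the idea, not a concrete proposal) of how the tail of the loop
in map_range() could be arranged, so the increment is shared between the
domU success path and the hwdom skip path:

        if ( rc == 0 )
        {
        done:
            if ( rc && is_hardware_domain(map->d) )
            {
                /*
                 * Ignore failures for the hardware domain and skip the
                 * range, as a best effort workaround to get the hardware
                 * domain to boot.
                 */
                rc = 0;
            }
            if ( !rc )
                /*
                 * Consume the range both on a successful map/unmap and on a
                 * swallowed hwdom error, so rangeset_consume_ranges() doesn't
                 * see a zero-sized consumption and call map_range() again on
                 * the same range.
                 */
                *c += size;
            break;
        }

That way a domU success still advances *c, and the hwdom path can't return 0
without consuming the range. The exact arrangement is of course up to you.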