On Wed, 14 Sep 2022 13:53:39 -0600 Alex Williamson <alex.william...@redhat.com> wrote:
> On Wed, 14 Sep 2022 12:02:56 -0700
> Nicolin Chen <nicol...@nvidia.com> wrote:
> 
> > Hi Alex,
> > 
> > On Wed, Sep 14, 2022 at 12:10:29PM -0600, Alex Williamson wrote:
> > 
> > > > > > Its caller vfio_connect_container() assigns a default value
> > > > > > to info->iova_pgsizes, even if vfio_get_iommu_info() fails.
> > > > > > This would result in a "Segmentation fault" error, when the
> > > > > > VFIO_IOMMU_GET_INFO ioctl errors out.
> > > > > > 
> > > > > > Since the caller has g_free already, drop the g_free in its
> > > > > > rollback routine and add a line of comments to highlight it.
> > > > > 
> > > > > There's basically two ways to fix this:
> > > > > 
> > > > > - return *info in any case, even on error
> > > > > - free *info on error, and make sure that the caller doesn't try to
> > > > >   access *info if the function returned !0
> > > > > 
> > > > > The problem with the first option is that the caller will access
> > > > > invalid information if it neglects to check the return code, and
> > > > > that might lead to not-that-obvious errors; in the second case, a
> > > > > broken caller would at least fail quickly with a segfault. The
> > > > > current code is easier to fix with the first option.
> > > > > 
> > > > > I think I'd prefer the second option; but obviously maintainer's
> > > > > choice.
> > > > 
> > > > The caller does check rc all the time. So I made a smaller fix
> > > > (the first option). Attaching the git-diff for the second one.
> > > > 
> > > > Alex, please let me know which one you prefer. Thanks!
> > > 
> > > I think we can do better than that, I don't think we need to maintain
> > > the existing grouping, and that FIXME comment is outdated and has
> > > drifted from the relevant line of code. What about:
> > 
> > Your patch looks good and clean to me (some nits inline).
> > 
> > Would you please send and merge it, replacing mine?
> 
> > > +    /*
> > > +     * Setup defaults for container pgsizes and dma_max_mappings if not
> > > +     * provided by kernel below.
> > > +     */
> > 
> > Indentation is misaligned at the first line.
> 
> Oops, will run through checkpatch.
> 
> > > +    container->pgsizes = 4096;
> > 
> > This might be a separate question/issue: I wonder if we should use
> > "sysconf(_SC_PAGE_SIZE)" here instead of 4096.
> > 
> > With a kernel using a larger page size, e.g. CONFIG_ARM64_64K_PAGES,
> > the IO page size is likely to be 64K too. If the ioctl fails, this
> > default 4K setup won't work.
> 
> Perhaps, but IIRC this solution came about because we originally forgot
> to expose the IOMMU_INFO flag to indicate the pgsize field was valid.
> At the time we only supported 4K systems, so it made sense to provide
> this default, though it is indeed dated.
> 
> TBH, I don't really see why we should try to continue if the ioctl
> itself fails, so maybe this should be:
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ace9562a9ba1..ad188b7649e6 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -2111,29 +2111,31 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>      {
>          struct vfio_iommu_type1_info *info;
> 
> -        /*
> -         * FIXME: This assumes that a Type1 IOMMU can map any 64-bit
> -         * IOVA whatsoever.  That's not actually true, but the current
> -         * kernel interface doesn't tell us what it can map, and the
> -         * existing Type1 IOMMUs generally support any IOVA we're
> -         * going to actually try in practice.
> -         */
>          ret = vfio_get_iommu_info(container, &info);
> +        if (ret) {

Clearly untested, ret = -errno;

> +            error_setg_errno(errp, -ret, "Failed to get VFIO IOMMU info");
> +            goto enable_discards_exit:;
> +        }
> 
> -        if (ret || !(info->flags & VFIO_IOMMU_INFO_PGSIZES)) {
> -            /* Assume 4k IOVA page size */
> -            info->iova_pgsizes = 4096;
> +        if (info->flags & VFIO_IOMMU_INFO_PGSIZES) {
> +            container->pgsizes = info->iova_pgsizes;
> +        } else {
> +            container->pgsizes = qemu_real_host_page_size();
>          }
> -        vfio_host_win_add(container, 0, (hwaddr)-1, info->iova_pgsizes);
> -        container->pgsizes = info->iova_pgsizes;
> 
> -        /* The default in the kernel ("dma_entry_limit") is 65535. */
> -        container->dma_max_mappings = 65535;
> -        if (!ret) {
> -            vfio_get_info_dma_avail(info, &container->dma_max_mappings);
> -            vfio_get_iommu_info_migration(container, info);
> +        if (!vfio_get_info_dma_avail(info, &container->dma_max_mappings)) {
> +            container->dma_max_mappings = 65535;
>          }
> +        vfio_get_iommu_info_migration(container, info);
>          g_free(info);
> +
> +        /*
> +         * FIXME: We should parse VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
> +         * information to get the actual window extent rather than assume
> +         * a 64-bit IOVA address space.
> +         */
> +        vfio_host_win_add(container, 0, (hwaddr)-1, container->pgsizes);
> +
>          break;
>      }
>      case VFIO_SPAPR_TCE_v2_IOMMU: