Hi Anatoly, > -----Original Message----- > From: Burakov, Anatoly > Sent: Monday, April 16, 2018 6:03 PM > To: Wang, Xiao W <xiao.w.w...@intel.com>; Yigit, Ferruh > <ferruh.yi...@intel.com> > Cc: dev@dpdk.org; maxime.coque...@redhat.com; Wang, Zhihong > <zhihong.w...@intel.com>; Bie, Tiwei <tiwei....@intel.com>; Tan, Jianfeng > <jianfeng....@intel.com>; Liang, Cunming <cunming.li...@intel.com>; Daly, > Dan <dan.d...@intel.com>; tho...@monjalon.net; Chen, Junjie J > <junjie.j.c...@intel.com> > Subject: Re: [PATCH v7 2/5] vfio: add multi container support > > On 15-Apr-18 4:33 PM, Xiao Wang wrote: > > This patch adds APIs to support container create/destroy and device > > bind/unbind with a container. It also provides API for IOMMU programing > > on a specified container. > > > > A driver could use "rte_vfio_create_container" helper to create a > > ^^ wrong API name in commit message :)
Thanks for the catch. Will fix it. > > > new container from eal, use "rte_vfio_bind_group" to bind a device > > to the newly created container. During rte_vfio_setup_device the > > container bound with the device will be used for IOMMU setup. > > > > Signed-off-by: Junjie Chen <junjie.j.c...@intel.com> > > Signed-off-by: Xiao Wang <xiao.w.w...@intel.com> > > Reviewed-by: Maxime Coquelin <maxime.coque...@redhat.com> > > Reviewed-by: Ferruh Yigit <ferruh.yi...@intel.com> > > --- > > lib/librte_eal/bsdapp/eal/eal.c | 52 +++++ > > lib/librte_eal/common/include/rte_vfio.h | 119 ++++++++++++ > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 316 > +++++++++++++++++++++++++++++++ > > lib/librte_eal/rte_eal_version.map | 6 + > > 4 files changed, 493 insertions(+) > > > > diff --git a/lib/librte_eal/bsdapp/eal/eal.c > > b/lib/librte_eal/bsdapp/eal/eal.c > > index 727adc5d2..c5106d0d6 100644 > > --- a/lib/librte_eal/bsdapp/eal/eal.c > > +++ b/lib/librte_eal/bsdapp/eal/eal.c > > @@ -769,6 +769,14 @@ int rte_vfio_noiommu_is_enabled(void); > > int rte_vfio_clear_group(int vfio_group_fd); > > int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len); > > int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len); > > +int rte_vfio_container_create(void); > > +int rte_vfio_container_destroy(int container_fd); > > +int rte_vfio_bind_group(int container_fd, int iommu_group_no); > > +int rte_vfio_unbind_group(int container_fd, int iommu_group_no); > > Maybe have these under "container" too? e.g. > rte_vfio_container_group_bind/unbind? Seems like it would be more > consistent that way - anything to do with custom containers would be > under rte_vfio_container_* namespace. Agree. > > > +int rte_vfio_container_dma_map(int container_fd, uint64_t vaddr, > > + uint64_t iova, uint64_t len); > > +int rte_vfio_container_dma_unmap(int container_fd, uint64_t vaddr, > > + uint64_t iova, uint64_t len); > > > > int rte_vfio_setup_device(__rte_unused const char *sysfs_base, > > __rte_unused const char *dev_addr, > > @@ -818,3 +826,47 @@ rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, > uint64_t __rte_unused iova, > > { > > return -1; > > } > > + > > <...> > > > diff --git a/lib/librte_eal/common/include/rte_vfio.h > b/lib/librte_eal/common/include/rte_vfio.h > > index d26ab01cb..0c1509b29 100644 > > --- a/lib/librte_eal/common/include/rte_vfio.h > > +++ b/lib/librte_eal/common/include/rte_vfio.h > > @@ -168,6 +168,125 @@ rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, > uint64_t len); > > int __rte_experimental > > rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len); > > > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice > > + * > > + * Create a new container for device binding. > > I would add a note that any newly allocated DPDK memory will not be > mapped into these containers by default. Will add it. > > > + * > > + * @return > > + * the container fd if successful > > + * <0 if failed > > + */ > > +int __rte_experimental > > +rte_vfio_container_create(void); > > + > > <...> > > > + * 0 if successful > > + * <0 if failed > > + */ > > +int __rte_experimental > > +rte_vfio_unbind_group(int container_fd, int iommu_group_no); > > + > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice > > + * > > + * Perform dma mapping for devices in a conainer. > > Here and in other places: "dma" should be DMA, and typo: "conainer" :) > > I think you should also add a note to the original API (not this one, > but the old one) that DMA maps done via that API will only apply to > default container and will not apply to any of the containers created > via container_create(). IOW, documentation should make it clear that if > you use this functionality, you're on your own and you have to manage > your own DMA mappings for any containers you create. OK, will add note to clearly describe it. > > > + * > > + * @param container_fd > > + * the specified container fd > > + * > > + * @param vaddr > > + * Starting virtual address of memory to be mapped. > > + * > > <...> > > > + > > +int __rte_experimental > > +rte_vfio_container_dma_map(int container_fd, uint64_t vaddr, uint64_t > iova, > > + uint64_t len) > > +{ > > + struct user_mem_map *new_map; > > + struct vfio_config *vfio_cfg; > > + struct user_mem_maps *user_mem_maps; > > + int ret = 0; > > + > > + if (len == 0) { > > + rte_errno = EINVAL; > > + return -1; > > + } > > + > > + vfio_cfg = get_vfio_cfg_by_container_fd(container_fd); > > + if (vfio_cfg == NULL) { > > + RTE_LOG(ERR, EAL, "Invalid container fd\n"); > > + return -1; > > + } > > + > > + user_mem_maps = &vfio_cfg->mem_maps; > > + rte_spinlock_recursive_lock(&user_mem_maps->lock); > > + if (user_mem_maps->n_maps == VFIO_MAX_USER_MEM_MAPS) { > > + RTE_LOG(ERR, EAL, "No more space for user mem maps\n"); > > + rte_errno = ENOMEM; > > + ret = -1; > > + goto out; > > + } > > + /* map the entry */ > > + if (vfio_dma_mem_map(vfio_cfg, vaddr, iova, len, 1)) { > > + /* technically, this will fail if there are currently no devices > > + * plugged in, even if a device were added later, this mapping > > + * might have succeeded. however, since we cannot verify if > this > > + * is a valid mapping without having a device attached, > consider > > + * this to be unsupported, because we can't just store any old > > + * mapping and pollute list of active mappings willy-nilly. > > + */ > > + RTE_LOG(ERR, EAL, "Couldn't map new region for DMA\n"); > > + ret = -1; > > + goto out; > > + } > > + /* create new user mem map entry */ > > + new_map = &user_mem_maps->maps[user_mem_maps->n_maps++]; > > + new_map->addr = vaddr; > > + new_map->iova = iova; > > + new_map->len = len; > > + > > + compact_user_maps(user_mem_maps); > > +out: > > + rte_spinlock_recursive_unlock(&user_mem_maps->lock); > > + return ret; > > Please correct me if i'm wrong, but it looks like you've just duplicated > the code for rte_vfio_dma_map() here and made a few small changes. It > would be better if you moved most of this into a static function (e.g. > static int container_dma_map(vfio_cfg, vaddr, iova, len)) and called it > with either default vfio_cfg from rte_vfio_dma_map, or found vfio_cfg > from rte_vfio_container_dma_map. Same applies to function below. Agree, will do it in v8. BRs, Xiao > > > +} > > + > > +int __rte_experimental > > +rte_vfio_container_dma_unmap(int container_fd, uint64_t vaddr, uint64_t > iova, > > + uint64_t len) > > +{ > > + struct user_mem_map *map, *new_map = NULL; > > + struct vfio_config *vfio_cfg; > > + struct user_mem_maps *user_mem_maps; > > + int ret = 0; > > + > > + if (len == 0) { > > + rte_errno = EINVAL; > > + return -1; > > + } > > + > > <...> > > -- > Thanks, > Anatoly