Hi,

> -----Original Message-----
> From: Vijay Kumar Srivastava <vsriv...@xilinx.com>
> Sent: Friday, September 3, 2021 9:20 PM
> To: Xia, Chenbo <chenbo....@intel.com>; dev@dpdk.org
> Cc: maxime.coque...@redhat.com; andrew.rybche...@oktetlabs.ru; Harpreet Singh
> Anand <han...@xilinx.com>; Praveen Kumar Jain <prave...@xilinx.com>
> Subject: RE: [PATCH 02/10] vdpa/sfc: add support for device initialization
>
> Hi Chenbo,
>
> >-----Original Message-----
> >From: Xia, Chenbo <chenbo....@intel.com>
> >Sent: Monday, August 30, 2021 4:22 PM
> >To: Vijay Kumar Srivastava <vsriv...@xilinx.com>; dev@dpdk.org
> >Cc: maxime.coque...@redhat.com; andrew.rybche...@oktetlabs.ru; Vijay
> >Kumar Srivastava <vsriv...@xilinx.com>
> >Subject: RE: [PATCH 02/10] vdpa/sfc: add support for device initialization
> >
> >Hi Vijay,
> >
> >> -----Original Message-----
> >> From: Vijay Srivastava <vijay.srivast...@xilinx.com>
> >> Sent: Wednesday, July 7, 2021 12:44 AM
> >> To: dev@dpdk.org
> >> Cc: maxime.coque...@redhat.com; Xia, Chenbo <chenbo....@intel.com>;
> >> andrew.rybche...@oktetlabs.ru; Vijay Kumar Srivastava
> >> <vsriv...@xilinx.com>
> >> Subject: [PATCH 02/10] vdpa/sfc: add support for device initialization
> >>
> >> From: Vijay Kumar Srivastava <vsriv...@xilinx.com>
> >>
> >> Add HW initialization and vDPA device registration support.
> >>
> >> Signed-off-by: Vijay Kumar Srivastava <vsriv...@xilinx.com>
> >> ---
>
> [snip]
>
> >> +sfc_vdpa_dma_alloc(struct sfc_vdpa_adapter *sva, const char *name,
> >> +		   size_t len, efsys_mem_t *esmp)
> >> +{
> >> +	void *mcdi_buf;
> >> +	uint64_t mcdi_iova;
> >> +	size_t mcdi_buff_size;
> >> +	int ret;
> >> +
> >> +	mcdi_buff_size = RTE_ALIGN_CEIL(len, PAGE_SIZE);
> >> +
> >> +	sfc_vdpa_log_init(sva, "name=%s, len=%zu", name, len);
> >> +
> >> +	mcdi_buf = rte_zmalloc(name, mcdi_buff_size, PAGE_SIZE);
> >> +	if (mcdi_buf == NULL) {
> >> +		sfc_vdpa_err(sva, "cannot reserve memory for %s: len=%#x: %s",
> >> +			     name, (unsigned int)len, rte_strerror(rte_errno));
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	/* IOVA address for MCDI would be re-calculated if mapping
> >
> >What is MCDI?
>
> MCDI is a control interface between the driver and the firmware.
> It is used by the host drivers to configure the adapter and retrieve status.
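>
> To make the allocation part above concrete on its own, here is a minimal
> standalone sketch (hypothetical names, error handling trimmed; the real
> logic is in sfc_vdpa_dma_alloc() above):
>
> #include <stddef.h>
> #include <rte_common.h>
> #include <rte_malloc.h>
>
> #define EXAMPLE_PAGE_SIZE 4096
>
> /* Reserve a zeroed, page-aligned buffer suitable for VFIO DMA mapping. */
> static void *
> example_dma_buf_alloc(const char *name, size_t len, size_t *out_len)
> {
> 	size_t buf_size = RTE_ALIGN_CEIL(len, EXAMPLE_PAGE_SIZE);
> 	void *buf = rte_zmalloc(name, buf_size, EXAMPLE_PAGE_SIZE);
>
> 	if (buf == NULL)
> 		return NULL; /* rte_errno holds the failure reason */
>
> 	*out_len = buf_size;
> 	return buf;
> }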
Cool, thanks for the explanation.

> >> +	 * using the default IOVA fails.
> >> +	 * TODO: Earlier there was no way to get the valid IOVA range.
> >> +	 * Recently a patch has been submitted to get the IOVA range
> >> +	 * using the ioctl VFIO_IOMMU_GET_INFO. This patch is available
> >> +	 * in kernel versions >= 5.4. Support for deriving the default
> >> +	 * IOVA address for the MCDI buffer from the available IOVA range
> >> +	 * will be added later. Meanwhile the default IOVA for the MCDI
> >> +	 * buffer is kept in high memory at 2TB. In case of overlap, new
> >> +	 * available addresses would be searched for and used instead.
> >> +	 */
> >> +	mcdi_iova = SFC_VDPA_DEFAULT_MCDI_IOVA;
> >> +
> >> +	do {
> >> +		ret = rte_vfio_container_dma_map(sva->vfio_container_fd,
> >> +						 (uint64_t)mcdi_buf, mcdi_iova,
> >> +						 mcdi_buff_size);
> >> +		if (ret == 0)
> >> +			break;
> >> +
> >> +		mcdi_iova = mcdi_iova >> 1;
> >> +		if (mcdi_iova < mcdi_buff_size) {
> >> +			sfc_vdpa_err(sva,
> >> +				     "DMA mapping failed for MCDI : %s",
> >> +				     rte_strerror(rte_errno));
> >> +			return ret;
> >> +		}
> >> +
> >> +	} while (ret < 0);
> >
> >Is this DMA region for some hardware-specific control msg?
> >
> >And how do you make sure the IOVA space you defined in this driver will not
> >conflict with the IOVA space that the vdpa device consumer (most likely QEMU)
> >defines (if QEMU, IOVA = guest physical address)?
>
> Currently the IOVA for the MCDI buffer is kept in very high memory, at 2TB.

OK. That sounds like a work-around to me, but we can't assume the consumer will
not use that address range. And there is a security issue here; please see the
comment below.

>
> To handle the IOVA overlap scenario, a patch is in progress and will be
> submitted soon.
> In that patch, upon detection of an IOVA overlap, a new available IOVA would
> be calculated and the MCDI buffer would be remapped to it.

Let's say there is a malicious guest that knows the initial IOVA range set up
by your driver (even if it does not know it, it can probe for it, which is why
a static IOVA range on the host is more dangerous). It can place that address
in any DMA-able queue and have the device DMA into the vdpa app. I think this
could cause a security issue, as it lets the guest easily write host memory.

For now I don't see a perfect solution except PASID (Process Address Space ID).
IIRC, we could let QEMU have a primary PASID and the vdpa app a secondary PASID
so that the VM can't perform DMA into the vdpa app. But since that needs HW
support, and the related support in vfio is not mature, I don't think we can
use that solution now.

Any solution you can think of for your HW?

Thanks,
Chenbo

>
> [snip]
>
> Thanks,
> Vijay
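PS: since the follow-up patch will need the valid IOVA ranges, here is a rough
sketch of querying them through VFIO_IOMMU_GET_INFO (kernel >= 5.4). It assumes
an already-open, IOMMU-enabled VFIO container fd and keeps error handling
minimal; treat it as an illustration, not the final code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int
print_valid_iova_ranges(int container_fd)
{
	struct vfio_iommu_type1_info hdr = { .argsz = sizeof(hdr) };
	struct vfio_iommu_type1_info *info;
	uint32_t off;

	/* The first call only fills in the argsz needed for the cap chain. */
	if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, &hdr) != 0)
		return -1;

	info = calloc(1, hdr.argsz);
	if (info == NULL)
		return -1;
	info->argsz = hdr.argsz;

	if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, info) != 0 ||
	    !(info->flags & VFIO_IOMMU_INFO_CAPS)) {
		free(info);
		return -1;
	}

	/* Walk the capability chain looking for the IOVA range capability. */
	for (off = info->cap_offset; off != 0;) {
		struct vfio_info_cap_header *cap =
			(struct vfio_info_cap_header *)((char *)info + off);

		if (cap->id == VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE) {
			struct vfio_iommu_type1_info_cap_iova_range *r =
				(struct vfio_iommu_type1_info_cap_iova_range *)cap;
			uint32_t i;

			for (i = 0; i < r->nr_iovas; i++)
				printf("usable IOVA: 0x%llx - 0x%llx\n",
				       (unsigned long long)r->iova_ranges[i].start,
				       (unsigned long long)r->iova_ranges[i].end);
		}
		off = cap->next;
	}

	free(info);
	return 0;
}

Picking the MCDI IOVA from one of the reported ranges could avoid the
hard-coded 2TB assumption, though it still doesn't solve the guest-overlap
problem by itself.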
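And for the overlap-handling patch: once a non-overlapping IOVA has been
chosen, the remap itself should just be an unmap followed by a map at the new
address. A sketch with DPDK's container API (hypothetical helper name;
vaddr/len are the ones used for the original mapping):

#include <stdint.h>
#include <rte_vfio.h>

static int
example_remap_mcdi(int container_fd, void *vaddr, uint64_t len,
		   uint64_t old_iova, uint64_t new_iova)
{
	int ret;

	/* Drop the old mapping; vaddr/iova/len must match the original map. */
	ret = rte_vfio_container_dma_unmap(container_fd, (uint64_t)vaddr,
					   old_iova, len);
	if (ret != 0)
		return ret;

	/* Re-establish the mapping at the newly chosen IOVA. */
	return rte_vfio_container_dma_map(container_fd, (uint64_t)vaddr,
					  new_iova, len);
}

The firmware would of course also need to be told the new MCDI buffer address,
but that part is device-specific.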