[dpdk-dev] [PATCH 02/10] vdpa/sfc: add support for device initialization

Xia, Chenbo chenbo.xia at intel.com
Mon Sep 6 05:02:16 CEST 2021


Hi,

> -----Original Message-----
> From: Vijay Kumar Srivastava <vsrivast at xilinx.com>
> Sent: Friday, September 3, 2021 9:20 PM
> To: Xia, Chenbo <chenbo.xia at intel.com>; dev at dpdk.org
> Cc: maxime.coquelin at redhat.com; andrew.rybchenko at oktetlabs.ru; Harpreet Singh
> Anand <hanand at xilinx.com>; Praveen Kumar Jain <praveenj at xilinx.com>
> Subject: RE: [PATCH 02/10] vdpa/sfc: add support for device initialization
> 
> 
> Hi Chenbo,
> 
> >-----Original Message-----
> >From: Xia, Chenbo <chenbo.xia at intel.com>
> >Sent: Monday, August 30, 2021 4:22 PM
> >To: Vijay Kumar Srivastava <vsrivast at xilinx.com>; dev at dpdk.org
> >Cc: maxime.coquelin at redhat.com; andrew.rybchenko at oktetlabs.ru; Vijay
> >Kumar Srivastava <vsrivast at xilinx.com>
> >Subject: RE: [PATCH 02/10] vdpa/sfc: add support for device initialization
> >
> >Hi Vijay,
> >
> >> -----Original Message-----
> >> From: Vijay Srivastava <vijay.srivastava at xilinx.com>
> >> Sent: Wednesday, July 7, 2021 12:44 AM
> >> To: dev at dpdk.org
> >> Cc: maxime.coquelin at redhat.com; Xia, Chenbo <chenbo.xia at intel.com>;
> >> andrew.rybchenko at oktetlabs.ru; Vijay Kumar Srivastava
> >> <vsrivast at xilinx.com>
> >> Subject: [PATCH 02/10] vdpa/sfc: add support for device initialization
> >>
> >> From: Vijay Kumar Srivastava <vsrivast at xilinx.com>
> >>
> >> Add HW initialization and vDPA device registration support.
> >>
> >> Signed-off-by: Vijay Kumar Srivastava <vsrivast at xilinx.com>
> >> ---
> 
> [snip]
> 
> >> +sfc_vdpa_dma_alloc(struct sfc_vdpa_adapter *sva, const char *name,
> >> +		   size_t len, efsys_mem_t *esmp)
> >> +{
> >> +	void *mcdi_buf;
> >> +	uint64_t mcdi_iova;
> >> +	size_t mcdi_buff_size;
> >> +	int ret;
> >> +
> >> +	mcdi_buff_size = RTE_ALIGN_CEIL(len, PAGE_SIZE);
> >> +
> >> +	sfc_vdpa_log_init(sva, "name=%s, len=%zu", name, len);
> >> +
> >> +	mcdi_buf = rte_zmalloc(name, mcdi_buff_size, PAGE_SIZE);
> >> +	if (mcdi_buf == NULL) {
> >> +		sfc_vdpa_err(sva, "cannot reserve memory for %s: len=%#x:
> >%s",
> >> +			     name, (unsigned int)len, rte_strerror(rte_errno));
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	/* IOVA address for MCDI would be re-calculated if mapping
> >
> >What is MCDI?
> 
> MCDI is a control interface between driver and firmware.
> It is used by the host drivers to configure the adapter and retrieve status.

Cool, thanks for explanation.

> 
> >> +	 * using default IOVA would fail.
> >> +	 * TODO: Earlier there was no way to get valid IOVA range.
> >> +	 * Recently a patch has been submitted to get the IOVA range
> >> +	 * using ioctl. VFIO_IOMMU_GET_INFO. This patch is available
> >> +	 * in the kernel version >= 5.4. Support to get the default
> >> +	 * IOVA address for MCDI buffer using available IOVA range
> >> +	 * would be added later. Meanwhile default IOVA for MCDI buffer
> >> +	 * is kept at high mem at 2TB. In case of overlap new available
> >> +	 * addresses would be searched and same would be used.
> >> +	 */
> >> +	mcdi_iova = SFC_VDPA_DEFAULT_MCDI_IOVA;
> >> +
> >> +	do {
> >> +		ret = rte_vfio_container_dma_map(sva->vfio_container_fd,
> >> +						 (uint64_t)mcdi_buf,
> >mcdi_iova,
> >> +						 mcdi_buff_size);
> >> +		if (ret == 0)
> >> +			break;
> >> +
> >> +		mcdi_iova = mcdi_iova >> 1;
> >> +		if (mcdi_iova < mcdi_buff_size)	{
> >> +			sfc_vdpa_err(sva,
> >> +				     "DMA mapping failed for MCDI : %s",
> >> +				     rte_strerror(rte_errno));
> >> +			return ret;
> >> +		}
> >> +
> >> +	} while (ret < 0);
> >
> >Is this DMA region for some hardware-specific control msg?
> >
> >And how do you make sure this IOVA space you defined in this driver will not
> >conflict with the IOVA space that vdpa device consumer (Most likely QEMU)
> >defines (If QEMU, IOVA = guest physical address)
> 
> Currently IOVA for MCDI buffer is kept at very high mem at 2TB.

OK. That sounds a work-around to me but we can't make assumption of consumer not
using that address range. And there is a security issue here, please see below
comment.

> 
> To handle IOVA overlap detection scenario a patch is in progress which will be
> submitted soon.
> In that patch, upon IOVA overlap detection new available IOVA would be
> calculated and MCDI buffer would be remapped to new IOVA.

Let's say there is a malicious guest who knows your initial IOVA range that is set
up by your driver (even if it does not know, it can use tests to know. So use static
IOVA range in host is more dangerous). It can use that address in any DMA-able queue
and make DMA into the vdpa app. I think it could cause some security issue as you
let guest easily writing host memory.

For now I don't see a perfect solution except PASID(Process Address Space ID). IIRC,
We could let QEMU have a primary PASID and vdpa app have a secondary PASID so that
VM can't perform DMA to vdpa app. But since it needs HW support and related support
in vfio is not mature, I don't think we are able to use that solution now.

Any solution you can think of for your HW?

Thanks,
Chenbo

> 
> [snip]
> 
> Thanks,
> Vijay



More information about the dev mailing list