[dpdk-dev] bifurcated driver

Zhou, Danny danny.zhou at intel.com
Wed Nov 5 23:19:32 CET 2014



From: Alex Markuze [mailto:alex at weka.io]
Sent: Wednesday, November 05, 2014 11:19 PM
To: Thomas Monjalon
Cc: Zhou, Danny; dev at dpdk.org; Fastabend, John R
Subject: Re: [dpdk-dev] bifurcated driver



On Wed, Nov 5, 2014 at 5:14 PM, Alex Markuze <alex at weka.io> wrote:
On Wed, Nov 5, 2014 at 3:00 PM, Thomas Monjalon <thomas.monjalon at 6wind.com> wrote:
Hi Danny,

2014-10-31 17:36, O'driscoll, Tim:
> Bifurcated Driver (Danny.Zhou at intel.com)

Thanks for the presentation of bifurcated driver during the community call.
I asked if you looked at ibverbs and you wanted a link to check.
The kernel module is here:
        http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core
The userspace library:
        http://git.kernel.org/cgit/libs/infiniband/libibverbs.git

Extract from Kconfig:
"
config INFINIBAND_USER_ACCESS
        tristate "InfiniBand userspace access (verbs and CM)"
        select ANON_INODES
        ---help---
          Userspace InfiniBand access support.  This enables the
          kernel side of userspace verbs and the userspace
          communication manager (CM).  This allows userspace processes
          to set up connections and directly access InfiniBand
          hardware for fast-path operations.  You will also need
          libibverbs, libibcm and a hardware driver library from
          <http://www.openfabrics.org/git/>.
"

It seems to be close to the bifurcated driver needs.
Not sure if it can solve the security issues if there is no dedicated MMU
in the NIC.

Mellanox NICs and other RDMA HW (InfiniBand/RoCE/iWARP) have MTT units (memory translation units), a dedicated MMU. These are filled via ibv_reg_mr calls, which create a process-VM-to-physical/iova memory mapping in the NIC. Thus each process can access only its own memory via the NIC. This is the way RNICs* resolve the security issue. I'm not sure how standard Intel NICs could support this scheme.

DZ: Intel NICs do not provide such an embedded memory translation unit, but Intel chipsets support an IOMMU, a generic memory protection mechanism that provides physical/iova memory mapping for DMA transactions on any PCIe device, not only NICs.
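
To make the IOMMU point concrete, here is a minimal sketch of how one might see which IOMMU group each PCIe device falls into on a Linux host. It assumes the kernel exposes groups under /sys/kernel/iommu_groups (which it does when the IOMMU is enabled); the function name and the SYSFS_ROOT override are hypothetical and exist only so the sketch can be exercised against a fake directory tree.

```shell
#!/bin/sh
# List PCI devices per IOMMU group by walking sysfs.
# SYSFS_ROOT is a hypothetical override for testing; leave it unset
# on a real host so that /sys is used.
list_iommu_groups() {
    root="${SYSFS_ROOT:-/sys}"
    for dev in "$root"/kernel/iommu_groups/*/devices/*; do
        # The glob stays unexpanded when no groups exist
        # (for example when the IOMMU is disabled), so skip it.
        [ -e "$dev" ] || continue
        group="${dev%/devices/*}"   # strip "/devices/<pci address>"
        group="${group##*/}"        # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Calling list_iommu_groups with no setup prints one line per device. An empty result suggests the IOMMU is not enabled, in which case DMA from a user-space-owned queue would not be confined by it.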

There is already a 6WIND PMD for Mellanox NICs. I'm assuming this PMD is verbs based and behaves similarly to the proposed bifurcated driver.
http://www.mellanox.com/page/press_release_item?id=979

DZ: Is it open sourced for the community to use? I guess the answer is no. Also, that PMD must have ported the majority of the Mellanox kernel driver code to DPDK, since a lot of NIC control code is needed. The bifurcated driver approach only needs to support the minimum Mellanox NIC-specific packet rx/tx routines to achieve the high performance DPDK claims, using all the DPDK performance optimization techniques such as huge pages, fixed-size packet buffers, zero-copy, PMD, etc. NIC control remains in the kernel driver, without porting it to DPDK.

One thing that I don't understand (and will be happy if someone could shed some light on) is how the NIC is supposed to distinguish between packets that need to go to the kernel driver rings and packets going to user space rings.

DZ: It depends on the user. The user should use standard ethtool (see the examples below) to enable flow director and distribute packets to a kernel-owned or user-space-owned rx queue, by specifying the 5-tuple as well as the destination rxq index. The flow director embedded in the NIC does the flow classification and distribution, rather than a software approach like DPDK KNI. If you argue that SR-IOV has a similar rx/tx queue pair partitioning capability, I would say the bifurcated driver approach provides much more flexibility than SR-IOV (e.g., a variable number of queue pairs allocated to user space, and L3 5-tuple based flow classification and distribution rather than SR-IOV's L2 classification based on MAC or VLAN).

ethtool -K ethX ntuple on   # enable flow director
ethtool -N ethX flow-type udp4 src-ip 0.0.0.0 action 0   # distribute UDP packets with source IP 0.0.0.0 to rx queue No. 0
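
A fuller 5-tuple rule could be composed along the following lines. This is only a sketch: the helper name, interface, addresses, ports, and queue index are hypothetical placeholders, and which rx queues are handed to user space depends on the bifurcated driver setup, which is not specified here. The helper prints the ethtool command instead of running it, so it can be inspected before being applied as root.

```shell
#!/bin/sh
# Print (do not execute) the ethtool command that steers one
# UDP/IPv4 flow to a given rx queue. All values are placeholders.
steer_udp4_cmd() {
    dev=$1 src_ip=$2 dst_ip=$3 src_port=$4 dst_port=$5 queue=$6
    printf 'ethtool -N %s flow-type udp4 src-ip %s dst-ip %s src-port %s dst-port %s action %s\n' \
        "$dev" "$src_ip" "$dst_ip" "$src_port" "$dst_port" "$queue"
}
```

For example, steer_udp4_cmd eth0 10.0.0.1 10.0.0.2 5000 6000 4 prints a rule that would send that flow to rx queue 4. Once a rule is applied, ethtool -n ethX lists the installed rules, and ethtool -N ethX delete followed by the rule ID removes one.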

I feel we should sum up pros and cons of
        - igb_uio
        - uio_pci_generic
        - VFIO
        - ibverbs
        - bifurcated driver
I suggest considering these criteria:
        - upstream status
        - usable with kernel netdev
        - usable in a VM
        - usable for Ethernet
        - hardware requirements
        - security protection
        - performance
Regarding ibverbs: I'm not sure how it's relevant to future DPDK development, but this is the rundown as I know it.
 This is a veteran package called OFED, or its counterpart Mellanox OFED.
   ---- The kernel drivers are upstream
   ---- The PCI device stays in the kernel's care throughout its life span
   ---- SR-IOV support exists; paravirt support exists only (AFAIK) as a VMware Office of the CTO project called vRDMA
   ---- Eth/RoCE (RDMA over Converged Ethernet)/IB
   === HW === RDMA-capable HW ONLY.
   ---- Security is designed into RDMA HW
   ---- Stellar performance, favored by HPC.

*RNIC: RDMA (Remote DMA: iWARP/InfiniBand/RoCE) capable NICs.

--
Thomas



