[dpdk-dev] Having troubles binding an SR-IOV VF to uio_pci_generic on Amazon instance

Alexander Duyck alexander.duyck at gmail.com
Thu Oct 1 23:02:24 CEST 2015


On 10/01/2015 06:14 AM, Michael S. Tsirkin wrote:
> On Thu, Oct 01, 2015 at 01:07:13PM +0100, Bruce Richardson wrote:
>>>> This in itself is going to use up
>>>> a good proportion of the processing time, as well as that we have to spend cycles
>>>> copying the descriptors from one ring in memory to another. Given that right now
>>>> with the vector ixgbe driver, the cycle cost per packet of RX is just a few dozen
>>>> cycles on modern cores, every additional cycle (fraction of a nanosecond) has
>>>> an impact.
>>>>
>>>> Regards,
>>>> /Bruce
>>> See above.  There is no need for that on data path. Only re-adding
>>> buffers requires a system call.
>>>
>> Re-adding buffers is a key part of the data path! Ok, the fact that its only on
>> descriptor rearm does allow somewhat bigger batches,
> That was the point, yes.
>
>> but the whole point of having
>> the kernel do this extra work you propose is to allow the kernel to scan and
>> sanitize the physical addresses - and that will take a lot of cycles, especially
>> if it has to handle all the different descriptor formats of all the different NICs,
>> as has already been pointed out.
>>
>> /Bruce
> Well the driver would be per NIC, so there's only need to support
> specific formats supported by a given NIC.

One thing that seems to be overlooked in your discussion is the cost to 
translate these descriptors.  It isn't as if most systems running DPDK 
have the cycles to spare.  As I believe was brought up in another thread 
we are looking at a budget of something like 68ns per packet at 10Gbps
line rate.  The overhead for having to go through and
translate/parse/validate the descriptors would end up being pretty
significant.  If you need proof of that just try running the ixgbe
driver and routing small packets.  We end
up spending something like 40ns in ixgbe_clean_rx_irq and that is mostly 
just translating the descriptor bits into the correct sk_buff bits.  
Also, trying to maintain a user-space ring in addition to the
kernel-space ring means that much more memory overhead and increases
the likelihood of things getting pushed out of the L1 cache.
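
For reference, the roughly 68ns figure can be reproduced from the
Ethernet framing overhead.  The sketch below assumes minimum-size
64-byte frames, each with 20 extra bytes on the wire (preamble,
start-of-frame delimiter, and inter-frame gap).  The helper names are
made up for illustration and are not part of DPDK or ixgbe.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame on-wire overhead: 7-byte preamble + 1-byte start-of-frame
 * delimiter + 12-byte inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20u

/* Bits a frame occupies on the wire.  frame_bytes includes the 4-byte
 * FCS, so a minimum-size frame is 64. */
static uint64_t wire_bits(uint32_t frame_bytes)
{
    return ((uint64_t)frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8u;
}

/* Time one frame occupies the wire, in picoseconds (hypothetical helper). */
static uint64_t wire_time_ps(uint32_t frame_bytes, uint64_t link_bps)
{
    assert(link_bps > 0);
    return wire_bits(frame_bytes) * 1000000000000ull / link_bps;
}

/* Maximum whole frames per second at line rate (hypothetical helper). */
static uint64_t max_pps(uint32_t frame_bytes, uint64_t link_bps)
{
    assert(link_bps > 0);
    return link_bps / wire_bits(frame_bytes);
}
```

With these assumptions, wire_time_ps(64, 10000000000ull) gives 67200 ps
(67.2ns) and max_pps(64, 10000000000ull) gives about 14.88 million
packets per second, so the 40ns quoted for ixgbe_clean_rx_irq would be
roughly 60% of the per-packet budget.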

As for the descriptor validation itself, the overhead for that would
guarantee that you cannot get any performance out of the device.  There
are too many corner cases that would have to be addressed in validating 
user-space input to allow for us to process packets in any sort of 
timely fashion.  For starters we would have to validate the size, 
alignment, and ownership of a given buffer. If it is a transmit buffer 
we have to go through and validate any offloads being requested.  Likely 
just the validation and translation would add tens if not hundreds of
nanoseconds to the time needed to process each packet.  In addition we
are talking about doing this in kernel space which means we wouldn't 
really be able to take advantage of things like SSE or AVX instructions.
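
To give a feel for the per-buffer work, here is a minimal sketch of only
the three checks named above (size, alignment, ownership).  Everything
in it is hypothetical: the struct layouts, the limits, and the function
name are assumptions for illustration, not code from any kernel driver.
Offload validation on transmit and translation into a NIC-specific
descriptor format would come on top of this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a DMA range the kernel has pinned for one process. */
struct dma_region {
    uint64_t iova;  /* first bus address of the range */
    uint64_t len;   /* length of the range in bytes */
};

/* Hypothetical: a buffer address/length pair supplied by user space. */
struct user_buf {
    uint64_t addr;
    uint32_t len;
};

/* Assumed limits; a real device would dictate its own. */
#define BUF_ALIGN   128u
#define BUF_MIN_LEN 64u
#define BUF_MAX_LEN 16384u

/* Return true only if the buffer passes the size and alignment checks
 * and lies entirely inside one region owned by the caller. */
static bool buf_is_valid(const struct user_buf *b,
                         const struct dma_region *regions, size_t nregions)
{
    assert(b != NULL && (regions != NULL || nregions == 0));

    if (b->len < BUF_MIN_LEN || b->len > BUF_MAX_LEN)
        return false;                       /* size */
    if (b->addr & (BUF_ALIGN - 1))
        return false;                       /* alignment */

    for (size_t i = 0; i < nregions; i++) { /* ownership */
        const struct dma_region *r = &regions[i];

        /* Subtraction form avoids overflow in addr + len. */
        if (b->addr >= r->iova && b->len <= r->len &&
            b->addr - r->iova <= r->len - b->len)
            return true;
    }
    return false;
}
```

Even this reduced version has several branches and a region lookup per
buffer, all of which would have to fit inside the per-packet budget
discussed earlier.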

> An alternative is to format the descriptors in kernel, based
> on just the list of addresses. This seems cleaner, but I don't
> know how efficient it would be.
>
> Device vendors and dpdk developers are probably the best people to
> figure out what's the best thing to do here.

As for the bifurcated driver approach, the only way something like
that would ever work is if you could limit the access via an IOMMU.  At
least everything I have seen proposed for a bifurcated driver still
involved an IOMMU if they expected to get any performance.

> But it looks like it's not going to happen unless security is made
> a requirement for upstreaming code.

The fact is we already ship uio_pci_generic.  User space drivers are 
here to stay.  What is being asked for is an extension to the existing 
infrastructure to allow MSI-X interrupts to trigger an event on a file 
descriptor.  As far as I know that doesn't add any additional security
risk, since it is the kernel PCIe subsystem itself that would be
programming the address and data for said device.  It wouldn't actually
grant any more access other than the additional file descriptors to
support MSI-X vectors.
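
As an illustration of what "an event on a file descriptor" could look
like from user space, here is a minimal sketch that assumes each MSI-X
vector is backed by a Linux eventfd.  That is an assumption for the
example; this mail does not specify the mechanism.  The function names
are hypothetical, and fake_irq() merely stands in for the kernel
signalling the descriptor when the interrupt fires.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Wait up to timeout_ms for the eventfd to be signalled.
 * Returns the number of events accumulated since the last read,
 * 0 on timeout, or -1 on error. */
static int64_t wait_for_irq(int efd, int timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    uint64_t count;
    int rc;

    assert(efd >= 0);
    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;  /* 0 = timeout, -1 = error */

    /* Reading an eventfd returns its counter and resets it to zero. */
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}

/* Stand-in for the kernel side: add one event to the counter. */
static int fake_irq(int efd)
{
    uint64_t one = 1;

    assert(efd >= 0);
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}
```

In this model user space only ever holds the file descriptor.  It never
sees or programs the MSI-X address and data, which is the point made
above about not granting any additional access.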

Anyway that is just my $.02 on this.

- Alex



