[dpdk-dev] [RFC PATCHv2 0/2] pktdev as wrapper type

Marc Sune marc.sune at bisdn.de
Wed May 20 19:01:02 CEST 2015



On 20/05/15 12:28, Neil Horman wrote:
> On Wed, May 20, 2015 at 12:05:00PM +0200, Marc Sune wrote:
>>
>> On 20/05/15 10:31, Thomas Monjalon wrote:
>>> 2015-05-19 12:31, Bruce Richardson:
>>>> On Mon, May 11, 2015 at 05:29:39PM +0100, Bruce Richardson wrote:
>>>>> Hi all,
>>>>>
>>>>> after a small amount of offline discussion with Marc Sune, here is an
>>>>> alternative proposal for a higher-level interface - aka pktdev - to allow a
>>>>> common Rx/Tx API across device types handling mbufs [for now, ethdev, ring
>>>>> and KNI]. The key code is in the first patch fo the set - the second is an
>>>>> example of a trivial usecase.
>>>>>
>>>>> What is different about this to previously:
>>>>> * wrapper class, so no changes to any existing ring, ethdev implementations
>>>>> * use of function pointers for RX/TX with an API that maps to ethdev
>>>>>    - this means there is little/no additional overhead for ethdev calls
>>>>>    - inline special case for rings, to accelerate that. Since we are at a
>>>>>      higher level, we can special case process some things if appropriate. This
>>>>>      means the impact to ring ops is one (predictable) branch per burst
>>>>> * elimination of the queue abstraction. For the ring and KNI, there is no
>>>>>    concept of queues, so we just wrap the functions directly (no need even for
>>>>>    wrapper functions, the api's match so we can call directly). This also
>>>>>    means:
>>>>>    - adding in features per-queue, is far easier as we don't need to worry about
>>>>>      having arrays of multiple queues. For example:
>>>>>    - adding in buffering on TX (or RX) is easier since again we only have a
>>>>>      single queue.
>>>>> * thread safety is made easier using a wrapper. For a MP ring, we can create
>>>>>    multiple pktdevs around it, and each thread will then be able to use their
>>>>>    own copy, with their own buffering etc.
>>>>>
>>>>> However, at this point, I'm just looking for general feedback on this as an
>>>>> approach. I think it's quite flexible - even more so than the earlier proposal
>>>>> we had. It's less proscriptive and doesn't make any demands on any other libs.
>>>>>
>>>>> Comments/thoughts welcome.
>>>> Any comments on this RFC before I see about investing further time in it to clean
>>>> it up a bit and submit as a non-RFC patchset for merge in 2.1?
>>> I would say there are 2 possible approaches for KNI and ring handling:
>>> 1/ You Bruce, Marc and Keith are advocating for a layer on top of ethdev,
>>> ring, KNI and possibly other devices, which uses mbuf. The set of functions
>>> is simpler than ethdev but the data structure is mbuf which is related to
>>> ethdev layer.
>>> 2/ Konstantin and Neil talked about keeping mbuf for ethdev layer and related
>>> libs only. Ring and KNI could have an ethdev API with a reduced set of
>>> implemented functions. Crypto devices could adopt a specific crypto API and
>>> an ethdev API at the same time.
>> I don't fully understand which APIs you meant by non-ethdev. This pktdev
>> wrapper proposal abstracts RX and TX functions only, and all of these are
>> using mbufs as the packet buffer abstraction right now anyway (ethdev).
>>
> He's referring to future device classes (like crypto devices), which ostensibly
> would make use of the pktdev API.  My argument (and I think Thomas') is that if
> a bit of hardware can be made to operate as a packet sending/receiving device,
> then its just as reasonable to use the existing ethdev api rather than some
> other restricted version of it (pktdev)
>
>> This approach does not preclude that different libraries expose other API
>> calls. In fact they will have to; setup the port/device ... It is just a
>> higher level API, so that you don't have to check the type of port in your
>> DPDK application I/O loop, minimizing user's code.
>>
> No argument there.  But if thats the case (and I agree that it is), an
> application will implicitly have to know what what type of device it is, because
> it (the application) will need to understand the specific API it is writing to.
>
>> Or were you in 2) thinking about creating a different "packet buffer"
>> abstraction, independent from the ethdev, and then map the different port
>> specifics (e.g. mbuf) to this new abstraction?
>>
> My argument was to just leave the ethdev api alone.  If a device class can be
> made to look like a packet forwarding device, then use the existing ethdev api
> to implement it.
>
>>> I feel it's cleaner, more generic and more maintainable to have drivers
>>> implementing one or several stable APIs instead of having some restricted
>>> wrappers to update.
>> This would be a separate library _on top_ of the existing APIs, and it has
>> the advantage to simplify the DPDK user's application code when an
>> application needs to deal with several types of port, as shown in the
>> example that Bruce provided in PATCH #2.
>>
> But thats already the purpose of the ethdev api.  Different types of
> hardware/software can be made to look like the same thing (an ethdev) from an
> application standpoint.  Adding this pktdev layer does nothing but that, add a
> layer.  If you want restricted functionality of an interface, thats ok, ethdev
> offers that ability.  unimplemented methods in a pmd cause the ethdev api to
> return EOPNOTSUP to the calling application, so the application knows when a
> given ethdev can't do some aspect of what an ethdev is.

Hi Neil,

Thanks for the clarifications. Now I understand the concern Thomas 
expressed. Using ethdev API (port-ids) was actually my first suggestion 
here:

http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545

And to be honest, what I was expecting when I was reading for the first 
time DPDK's APIs. It is indeed an option. However, if we take a look at 
the API:

http://www.dpdk.org/doc/api/rte__ethdev_8h.html

none of the API calls, except the burst RX/TX and, perhaps, the 
callbacks, would be used by devices other than NICs. It seems going a 
bit too far using it, but ofc possible.

In essence, rte_ether(rte_ethdev.h) right now has: i) NIC setup; general 
configuration, queue config, fdir, offloads, hw stuff like leds... ii) 
RX/TX routines and callbacks iii) Stats and queue stats iv) other utils 
for ethernet stuff (rte_ether.h)

i) is clearly HW specific, and does only apply to NICs/ASICs (e.g. 
FM10k) while ii) and iii) are things that could be abstracted beyond 
NICs, like KNI, rte_ring, crypto... (iv could be moved into some 
utils/protocol parsing libraries).

Perhaps these two groups could be split into two different libraries and 
then ii) and iii) together would be something like ~ rte_pktdev (stats 
are missing on the proposed patch), while i) would be rte_ether, or 
rte_nic if we think it is a better name.

In any case, and I think we all agree here, I just think that one way or 
another this should be abstracted so that it simplifies (and reduces) a 
bit the code of DPDK applications.

Marc

>
>> I don't see why this could limit us or make it less maintainable. Of course
>> this is an RFC patch; appropriate tests are missing (Bruce I can help you on
>> that)
>>
> It doesn't limit us, its just not a useful abstraction, because we already have
> the abilities it provides.
>
> Neil
>> Marc
>>
>>> Comments are welcome.
>>



More information about the dev mailing list