[dpdk-users] KNI Threads/Cores

Cliff Burdick shaklee3 at gmail.com
Wed Jun 8 21:31:17 CEST 2016

Thanks Matt! I will try that. It seems very clean.

On Wed, Jun 8, 2016 at 9:45 AM, Matt Laswell <laswell at infinite.io> wrote:

> Hey Cliff,
> I have a similar use case in my application.  If you're willing to
> dedicate an lcore per socket, another way to approach what you're
> describing is to create a KNI interface thread that talks to the other
> cores via message rings.  That is, the cores that are interacting with the
> NIC read a bunch of packets, determine if any of them need to go to KNI
> and, if so, enqueue them using rte_ring_enqueue().  They also do a periodic
> rte_ring_dequeue() on another queue to accept back any packets that come
> back from KNI.
>
> The KNI interface thread, meanwhile, just loops along, taking packets in
> from the NIC interface threads via rte_ring_dequeue() and sending them to
> KNI, and taking packets from KNI and returning them to the NIC interface
> threads via rte_ring_enqueue().
>
> I've found that this sort of scheme works well, and is reasonably clean
> architecturally.  Also, I found that calls into KNI can at times be very
> slow.  In my application, I would periodically see KNI calls take 50-100K
> cycles, which can cause congestion if you're handling large volumes of
> traffic.  Letting a non-critical thread handle this interface was a big win
> for me.
>
> This leaves the kernel side processing out, of course.  But if the traffic
> going to the kernel is lightweight, you likely don't need a dedicated core
> for the kernel-side RX and TX work.
> --
> Matt Laswell
> Principal Software Engineer
> infinite io
> On Wed, Jun 8, 2016 at 11:30 AM, Cliff Burdick <shaklee3 at gmail.com> wrote:
>> Hi, I have an application running on two sockets, in which I'm planning
>> to transmit and receive a fairly large amount of traffic per core. Each
>> core right now handles a single queue, either TX or RX, of a given port.
>> Across all the cores, I may be processing up to 12 ports. I also need to
>> handle things like ARP and ping, so I'm going to add in the KNI driver to
>> handle that. Since the amount of traffic I expect to forward to Linux is
>> very small, it seems like I should be able to dedicate one lcore per
>> socket to this functionality and have the dataplane cores pass the
>> traffic off to that core using rte_kni_tx_burst().
>>
>> My question is, first of all, is this possible? It seems like I can
>> configure the KNI driver to start in "single thread" mode. From that
>> point, I want to initialize one KNI device for each port, and have the
>> kernel lcore on each processor handle that traffic. I believe that if I
>> call rte_kni_alloc with core_id set to the kernel lcore for each device,
>> then in the end I'll have something like 6 KNI devices on socket 1 being
>> handled by lcore 0, and 6 KNI devices on socket 2 being handled by
>> lcore 31, as an example. Then my threads that are handling the dataplane
>> tx/rx can simply be passed a pointer to their respective rte_kni device.
>> Does this sound correct?
>>
>> Also, the sample says the core affinity needs to be set using taskset. Is
>> that already taken care of with conf.core_id in rte_kni_alloc, or do I
>> still need to set it?
>> Thanks
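
A minimal sketch of the ring hand-off Matt describes above, written in plain C so that it compiles without DPDK. Everything in it is an assumption made for illustration: `struct ring`, `ring_enqueue()` and `ring_dequeue()` are simplified stand-ins for an rte_ring and rte_ring_enqueue()/rte_ring_dequeue(), `struct pkt` stands in for an mbuf, and `fake_kernel` replaces the real KNI device. In a DPDK application the two marked spots would presumably call rte_kni_tx_burst() and rte_kni_rx_burst() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* power of two; the ring holds RING_SIZE - 1 entries */

/* Hypothetical stand-in for an rte_ring: a fixed-size pointer queue.
 * It has no memory barriers, so it is only valid single-threaded. */
struct ring {
    void *slots[RING_SIZE];
    unsigned head; /* next slot to write */
    unsigned tail; /* next slot to read */
};

/* Returns 0 on success, -1 if the ring is full. */
int ring_enqueue(struct ring *r, void *obj)
{
    unsigned next = (r->head + 1) & (RING_SIZE - 1);
    if (next == r->tail)
        return -1;
    r->slots[r->head] = obj;
    r->head = next;
    return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
int ring_dequeue(struct ring *r, void **obj)
{
    if (r->tail == r->head)
        return -1;
    *obj = r->slots[r->tail];
    r->tail = (r->tail + 1) & (RING_SIZE - 1);
    return 0;
}

/* Hypothetical stand-in for an mbuf plus its classification result. */
struct pkt {
    int id;
    bool for_kernel; /* e.g. ARP or ping, as decided by the dataplane core */
};

/* Dataplane core: after reading a burst from the NIC, enqueue the packets
 * that need to go to KNI. Returns how many were handed off. */
unsigned dataplane_handoff(struct ring *to_kni, struct pkt **burst, unsigned n)
{
    unsigned sent = 0;
    for (unsigned i = 0; i < n; i++) {
        /* If the ring is full the packet is simply skipped in this sketch. */
        if (burst[i]->for_kernel && ring_enqueue(to_kni, burst[i]) == 0)
            sent++;
    }
    return sent;
}

/* KNI thread: one iteration of its loop. Returns packets sent towards KNI. */
unsigned kni_thread_poll(struct ring *to_kni, struct ring *from_kni,
                         struct ring *fake_kernel)
{
    void *obj;
    unsigned moved = 0;

    /* Dataplane -> KNI (a real application would call into KNI here). */
    while (ring_dequeue(to_kni, &obj) == 0) {
        if (ring_enqueue(fake_kernel, obj) == 0)
            moved++;
    }
    /* KNI -> dataplane (fake_kernel just loops packets back). */
    while (ring_dequeue(fake_kernel, &obj) == 0) {
        if (ring_enqueue(from_kni, obj) != 0)
            break; /* return ring full: packet dropped in this sketch */
    }
    return moved;
}
```

A dataplane core would call `dataplane_handoff()` after each RX burst and periodically drain `from_kni`, while the dedicated lcore calls `kni_thread_poll()` in a loop. The point of the split is the one made in the thread: slow KNI calls stay off the cores that talk to the NIC.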
