[PATCH v1 1/2] net/memif: add a Rx fast path

Ferruh Yigit ferruh.yigit at amd.com
Thu May 19 18:38:31 CEST 2022


On 5/19/2022 4:09 PM, Joyce Kong wrote:
> [CAUTION: External Email]
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit at xilinx.com>
>> Sent: Thursday, May 19, 2022 1:06 AM
>> To: Joyce Kong <Joyce.Kong at arm.com>; Jakub Grajciar <jgrajcia at cisco.com>
>> Cc: Ruifeng Wang <Ruifeng.Wang at arm.com>; dev at dpdk.org; nd
>> <nd at arm.com>
>> Subject: Re: [PATCH v1 1/2] net/memif: add a Rx fast path
>>
>> On 5/17/2022 11:51 AM, Joyce Kong wrote:
>>> For memif non-zero-copy mode, there is a branch to compare
>>> the mbuf and memif buffer size during memory copying. Add
>>> a fast memory copy path by removing this branch with mbuf
>>> and memif buffer size defined at compile time. The removal
>>> of the branch leads to considerable performance uplift.
>>>
>>> When memif <= buffer size, Rx chooses the fast memcpy path,
>>> otherwise it would choose the original path.
>>>
>>> Test with 1p1q on Ampere Altra AArch64 server,
>>> --------------------------------------------
>>>     buf size  | memif <= mbuf | memif > mbuf |
>>> --------------------------------------------
>>> non-zc gain |     4.30%     |    -0.52%    |
>>> --------------------------------------------
>>>      zc gain  |     2.46%     |     0.70%    |
>>> --------------------------------------------
>>>
>>> Test with 1p1q on Cascade Lake Xeon X86server,
>>> -------------------------------------------
>>>     buf size  | memif <= mbuf | memif > mbuf |
>>> -------------------------------------------
>>> non-zc gain |     2.13%     |    -1.40%    |
>>> -------------------------------------------
>>>      zc gain  |     0.18%     |     0.48%    |
>>> -------------------------------------------
>>>
>>> Signed-off-by: Joyce Kong <joyce.kong at arm.com>
>>
>> <...>
>>
>>> +   } else {
>>> +           while (n_slots && n_rx_pkts < nb_pkts) {
>>> +                   mbuf_head = rte_pktmbuf_alloc(mq->mempool);
>>> +                   if (unlikely(mbuf_head == NULL))
>>> +                           goto no_free_bufs;
>>> +                   mbuf = mbuf_head;
>>> +                   mbuf->port = mq->in_port;
>>> +
>>> +next_slot2:
>>> +                   s0 = cur_slot & mask;
>>> +                   d0 = &ring->desc[s0];
>>>
>>> -                   rte_memcpy(rte_pktmbuf_mtod_offset(mbuf, void *,
>>> -                                                      dst_off),
>>> -                           (uint8_t *)memif_get_buffer(proc_private, d0)
>> +
>>> -                           src_off, cp_len);
>>> +                   src_len = d0->length;
>>> +                   dst_off = 0;
>>> +                   src_off = 0;
>>
>> Hi Joyce, Jakub,
>>
>> Something doesn't look right in the original code (not in this patch),
>> can you please help me check if I am missing something?
>>
>> For the memif buffer segmented case, first buffer will be copied to
>> mbuf, 'dst_off' increased and jump back to process next memif segment:
>>
>>    + d0
>>    |
>>    v
>> +++  +-+
>> |a+->+b|
>> +-+  +-+
>>
>> +---+
>> |a  |
>> +-+-+
>>     ^
>>     |
>>     + dst_off
>>
>> "
>>       if (d0->flags & MEMIF_DESC_FLAG_NEXT)
>>            goto next_slot;
>> "
>>
>> But here 'dst_off' set back to '0', wont this cause next memif buffer
>> segment to write to beginning of mbuf overwriting previous data?
>>
>> Thanks,
>> Ferruh
> 
> Hi Ferruh,
> 
> Agree with you here, and sorry I didn’t notice it before. Perhaps moving
> 'det_off = 0' to the line above 'next_slot' would solve the overwriting?
> 

Yes, I think this solves the issue.


And I wonder why this is not caught by testing. @Jakub, is the segmented 
memif buffers not a common use case?

I did able to reproduce the issue as following (and confirm suggested 
change fixes it):

server
./build/app/dpdk-testpmd --proc-type=primary --file-prefix=pmd1 
--vdev=net_memif0,role=server,bsize=32 -- -i --txpkts=512
 > set fwd txonly
 > start

client
./build/app/dpdk-testpmd --proc-type=primary --file-prefix=pmd2 
--vdev=net_memif1,bsize=32 -- -i
 > set fwd rxonly
 > set verbose 3
 > start

'client' will display packets info wrong, it will be all '0'. Also it is 
possible to capture packets in client and confirm.


More information about the dev mailing list