[PATCH v1 1/2] net/memif: add a Rx fast path
Ferruh Yigit
ferruh.yigit at amd.com
Thu May 19 18:38:31 CEST 2022
On 5/19/2022 4:09 PM, Joyce Kong wrote:
>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit at xilinx.com>
>> Sent: Thursday, May 19, 2022 1:06 AM
>> To: Joyce Kong <Joyce.Kong at arm.com>; Jakub Grajciar <jgrajcia at cisco.com>
>> Cc: Ruifeng Wang <Ruifeng.Wang at arm.com>; dev at dpdk.org; nd
>> <nd at arm.com>
>> Subject: Re: [PATCH v1 1/2] net/memif: add a Rx fast path
>>
>> On 5/17/2022 11:51 AM, Joyce Kong wrote:
>>> For memif non-zero-copy mode, there is a branch to compare
>>> the mbuf and memif buffer size during memory copying. Add
>>> a fast memory copy path by removing this branch with mbuf
>>> and memif buffer size defined at compile time. The removal
>>> of the branch leads to considerable performance uplift.
>>>
>>> When the memif buffer size <= mbuf buffer size, Rx chooses the
>>> fast memcpy path, otherwise it takes the original path.
>>>
>>> Test with 1p1q on Ampere Altra AArch64 server,
>>> --------------------------------------------
>>> buf size | memif <= mbuf | memif > mbuf |
>>> --------------------------------------------
>>> non-zc gain | 4.30% | -0.52% |
>>> --------------------------------------------
>>> zc gain | 2.46% | 0.70% |
>>> --------------------------------------------
>>>
>>> Test with 1p1q on Cascade Lake Xeon X86server,
>>> -------------------------------------------
>>> buf size | memif <= mbuf | memif > mbuf |
>>> -------------------------------------------
>>> non-zc gain | 2.13% | -1.40% |
>>> -------------------------------------------
>>> zc gain | 0.18% | 0.48% |
>>> -------------------------------------------
>>>
>>> Signed-off-by: Joyce Kong <joyce.kong at arm.com>
>>
>> <...>
>>
>>> + } else {
>>> + while (n_slots && n_rx_pkts < nb_pkts) {
>>> + mbuf_head = rte_pktmbuf_alloc(mq->mempool);
>>> + if (unlikely(mbuf_head == NULL))
>>> + goto no_free_bufs;
>>> + mbuf = mbuf_head;
>>> + mbuf->port = mq->in_port;
>>> +
>>> +next_slot2:
>>> + s0 = cur_slot & mask;
>>> + d0 = &ring->desc[s0];
>>>
>>> - rte_memcpy(rte_pktmbuf_mtod_offset(mbuf, void *,
>>> - dst_off),
>>> - (uint8_t *)memif_get_buffer(proc_private, d0) +
>>> - src_off, cp_len);
>>> + src_len = d0->length;
>>> + dst_off = 0;
>>> + src_off = 0;
>>
>> Hi Joyce, Jakub,
>>
>> Something doesn't look right in the original code (not in this patch),
>> can you please help me check if I am missing something?
>>
>> For the memif buffer segmented case, first buffer will be copied to
>> mbuf, 'dst_off' increased and jump back to process next memif segment:
>>
>> + d0
>> |
>> v
>> +++ +-+
>> |a+->+b|
>> +-+ +-+
>>
>> +---+
>> |a |
>> +-+-+
>> ^
>> |
>> + dst_off
>>
>> "
>> if (d0->flags & MEMIF_DESC_FLAG_NEXT)
>> goto next_slot;
>> "
>>
>> But here 'dst_off' set back to '0', wont this cause next memif buffer
>> segment to write to beginning of mbuf overwriting previous data?
>>
>> Thanks,
>> Ferruh
>
> Hi Ferruh,
>
> Agree with you here, and sorry I didn't notice it before. Perhaps moving
> 'dst_off = 0' to the line above the 'next_slot' label would solve the
> overwriting?
>
Yes, I think this solves the issue.
And I wonder why this was not caught by testing. @Jakub, are segmented
memif buffers not a common use case?
I was able to reproduce the issue as follows (and confirmed the suggested
change fixes it):
server:
./build/app/dpdk-testpmd --proc-type=primary --file-prefix=pmd1 --vdev=net_memif0,role=server,bsize=32 -- -i --txpkts=512
> set fwd txonly
> start

client:
./build/app/dpdk-testpmd --proc-type=primary --file-prefix=pmd2 --vdev=net_memif1,bsize=32 -- -i
> set fwd rxonly
> set verbose 3
> start
'client' will display the packet info incorrectly; it will be all '0'.
It is also possible to capture packets in the client and confirm.
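The overwrite can be illustrated with a small standalone sketch (this is
not the actual memif code; 'struct seg' and 'copy_chain' are hypothetical
names used only to model the copy loop). Resetting the destination offset
per segment makes each memif segment land at the start of the mbuf,
clobbering the previous one; resetting it once per packet, before the
segment loop, gives the expected result:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the Rx copy loop: a packet arrives as a chain
 * of source segments and is copied into one destination buffer. */
struct seg {
	const char *data;
	size_t len;
};

/* Copy all segments into 'dst'.  With 'buggy' set, dst_off is reset to
 * 0 for every segment (the behavior Ferruh describes); otherwise it is
 * reset once per packet, before the loop.  Returns the bytes written
 * past the final offset. */
static size_t
copy_chain(char *dst, const struct seg *segs, int n, int buggy)
{
	size_t dst_off = 0;	/* fixed: reset once per packet */
	int i;

	for (i = 0; i < n; i++) {
		if (buggy)
			dst_off = 0;	/* bug: reset per segment */
		memcpy(dst + dst_off, segs[i].data, segs[i].len);
		dst_off += segs[i].len;
	}
	return dst_off;
}
```

With two segments "AB" and "CD", the fixed loop yields "ABCD", while the
buggy one leaves only "CD" at the start of the buffer, matching the
all-zero / corrupted packet info seen in the rxonly client.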