[dpdk-dev] [PATCH v4 1/9] examples/ipsec-secgw: avoid to request unused TX offloads

Akhil Goyal akhil.goyal at nxp.com
Mon Dec 24 11:54:25 CET 2018



On 12/24/2018 3:49 PM, Ananyev, Konstantin wrote:
>
>>>> On 12/14/2018 10:10 PM, Konstantin Ananyev wrote:
>>>>> ipsec-secgw always enables TX offloads
>>>>> (DEV_TX_OFFLOAD_MULTI_SEGS, DEV_TX_OFFLOAD_SECURITY),
>>>>> even when they are not requested by the config.
>>>>> That causes many PMDs to choose the full-featured TX function,
>>>>> which in many cases is much slower than one without offloads.
>>>>> This patch adds checks to enable extra HW offloads only when
>>>>> they are requested.
>>>>> Plus it enables DEV_TX_OFFLOAD_IPV4_CKSUM
>>>>> only when other HW TX offloads are going to be enabled.
>>>>> Otherwise the SW version of the IP cksum calculation is used.
>>>>> That allows the vector TX function to be used when inline-ipsec is not
>>>>> requested.
>>>>>
>>>>> Signed-off-by: Remy Horton <remy.horton at intel.com>
>>>>> Signed-off-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
>>>>> Acked-by: Radu Nicolau <radu.nicolau at intel.com>
>>>>> ---
>>>>>     examples/ipsec-secgw/ipsec-secgw.c | 44 +++++++++++++++--------
>>>>>     examples/ipsec-secgw/ipsec.h       |  6 ++++
>>>>>     examples/ipsec-secgw/sa.c          | 56 ++++++++++++++++++++++++++++++
>>>>>     3 files changed, 91 insertions(+), 15 deletions(-)
>>>>>
>>>>> diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
>>>>> index 1bc0b5b50..cfc2b05e5 100644
>>>>> --- a/examples/ipsec-secgw/ipsec-secgw.c
>>>>> +++ b/examples/ipsec-secgw/ipsec-secgw.c
>>>>> @@ -208,8 +208,6 @@ static struct rte_eth_conf port_conf = {
>>>>>     	},
>>>>>     	.txmode = {
>>>>>     		.mq_mode = ETH_MQ_TX_NONE,
>>>>> -		.offloads = (DEV_TX_OFFLOAD_IPV4_CKSUM |
>>>>> -			     DEV_TX_OFFLOAD_MULTI_SEGS),
>>>> I believe this is disabling checksum offload for all cases and then
>>>> enabling only for inline crypto and inline proto.
>>> Yes.
>>>
>>>> This is breaking lookaside proto and lookaside none cases. Please
>>>> correct me if I am wrong.
>>> Why breaking?
>> A reduction in performance is effectively a breakage.
> I didn’t observe any performance drop with that patch.
> In fact there was a tiny improvement (see below).
> Did you see any regression with this patch on your HW?
NXP hardware consists of low-end to mid-range devices, and we are always
bottlenecked by core cycles.
So we would like to offload as much as possible to HW.
>
>>> For cases when HW cksum offload is disabled, IPv4 cksum calculation
>>> will be done in SW, see below:
>>> prepare_tx_pkt(...)
>>> {
>>>      ...
>>>       +
>>>       +		/* calculate IPv4 cksum in SW */
>>>       +		if ((pkt->ol_flags & PKT_TX_IP_CKSUM) == 0)
>>>       +			ip->ip_sum = rte_ipv4_cksum((struct ipv4_hdr *)ip);
>>>
>>>
>>> We tested lookaside-none case quite extensively - all works well,
>>> in fact on Intel NICs it became even a bit faster because of that change
>>> (though not much).
>> Yes, it may work well on one kind of hardware, but may not perform well on
>> other hardware where cores are limited.
> Could you elaborate a bit more on what you mean by 'cores are limited' here?
We also have single-core devices on which we run ipsec-secgw.
> Do you mean that for some low-end CPUs calculating the IPv4 cksum in SW is too expensive?
Yes, we are limited by core cycles, not by HW.
> Note that prepare_tx_pkts() and friends read/write L2/L3 packet headers anyway -
> so IPv4 header will be in L1 cache already.
Agreed, but it will still consume more core cycles than offloading the
checksum to HW.
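
For reference, the per-packet work under discussion is roughly the
following. This is only a minimal, self-contained sketch of an RFC 1071
style IPv4 header checksum, not the DPDK rte_ipv4_cksum() implementation.
The names sw_ipv4_hdr_cksum, fill_ipv4_cksum and HYP_TX_IP_CKSUM are
hypothetical and stand in for the real helpers and for PKT_TX_IP_CKSUM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PKT_TX_IP_CKSUM; the bit value is illustrative. */
#define HYP_TX_IP_CKSUM (1ULL << 0)

/*
 * One's-complement sum over the IPv4 header (RFC 1071).
 * hdr points at the header bytes in network order, len is IHL * 4.
 * The checksum field (bytes 10 and 11) must be zero on entry.
 * Returns the checksum as a number whose high byte goes to hdr[10].
 */
static uint16_t
sw_ipv4_hdr_cksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* add up the header as big-endian 16-bit words */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

	/* fold the carries back into the low 16 bits */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/*
 * Mirrors the check quoted above: compute the checksum in SW only when
 * the packet is not flagged for HW checksum offload. With the flag set,
 * the field is left zeroed here (assumption: HW fills it in on transmit).
 */
static void
fill_ipv4_cksum(uint8_t *hdr, size_t len, uint64_t ol_flags)
{
	hdr[10] = 0;
	hdr[11] = 0;

	if ((ol_flags & HYP_TX_IP_CKSUM) == 0) {
		uint16_t c = sw_ipv4_hdr_cksum(hdr, len);

		hdr[10] = (uint8_t)(c >> 8);
		hdr[11] = (uint8_t)(c & 0xff);
	}
}
```

For a 20-byte header without options this is ten 16-bit additions plus the
fold, on data that is already in cache. It is not free, which is the
concern on cycle-bound cores.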
>
>>> Disabling HW offloads when they are not really required has 2 benefits:
>>>    1) allows app to be run on NICs without HW offloads support.
>>>    2) allows dev_configure() for TX path to select simple/vector TX functions
>>>        which for many NICs are significantly faster.
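
To make the selection rule from the patch description concrete, here is a
rough sketch of how per-port TX offloads could be derived from the
configured SAs. It is not the code from the patch. All HYP_* names, the
struct layout and the needs_multi_seg parameter are hypothetical, and the
bit values do not match the real DEV_TX_OFFLOAD_* flags.

```c
#include <stdint.h>

/* Hypothetical stand-ins for DEV_TX_OFFLOAD_* bits; values are illustrative. */
#define HYP_TX_OFFLOAD_IPV4_CKSUM (1ULL << 0)
#define HYP_TX_OFFLOAD_MULTI_SEGS (1ULL << 1)
#define HYP_TX_OFFLOAD_SECURITY   (1ULL << 2)

enum hyp_sa_type {
	HYP_SA_LOOKASIDE_NONE,
	HYP_SA_LOOKASIDE_PROTO,
	HYP_SA_INLINE_CRYPTO,
	HYP_SA_INLINE_PROTO,
};

/* Hypothetical, simplified view of one configured SA. */
struct hyp_sa {
	uint16_t port_id;
	enum hyp_sa_type type;
};

/*
 * Return the TX offloads to request for port_id:
 *  - SECURITY only if some SA on that port uses an inline mode;
 *  - MULTI_SEGS only if the caller says it is needed (assumption: the
 *    application derives this from its own configuration);
 *  - IPV4_CKSUM only if another HW TX offload is already being enabled,
 *    so that a port with no offloads stays eligible for a simple or
 *    vector TX function.
 */
static uint64_t
hyp_port_tx_offloads(const struct hyp_sa *sa, unsigned int nb_sa,
		uint16_t port_id, int needs_multi_seg)
{
	uint64_t ol = 0;
	unsigned int i;

	for (i = 0; i < nb_sa; i++) {
		if (sa[i].port_id != port_id)
			continue;
		if (sa[i].type == HYP_SA_INLINE_CRYPTO ||
				sa[i].type == HYP_SA_INLINE_PROTO)
			ol |= HYP_TX_OFFLOAD_SECURITY;
	}

	if (needs_multi_seg)
		ol |= HYP_TX_OFFLOAD_MULTI_SEGS;

	/* HW cksum only rides along with other offloads */
	if (ol != 0)
		ol |= HYP_TX_OFFLOAD_IPV4_CKSUM;

	return ol;
}
```

With this rule a port that carries only lookaside SAs requests no TX
offloads at all, which is the case being debated in this thread: the
IPv4 checksum for those packets is then computed in SW.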
>>>
>>> Konstantin
>>>
>>>> So a NACK for this if my understanding is correct.
>>>>


