[dpdk-dev] [EXT] RE: [PATCH] examples/ipsec-secgw: fix dropping of initial IPsec pkts

Lukas Bartosik [C] lbartosik at marvell.com
Wed Apr 29 14:52:42 CEST 2020


Hi Konstantin,

Please see my answer below.

Thanks,
Lukasz

On 24.04.2020 14:35, Ananyev, Konstantin wrote:
>>>
>>> Hi Lukas,
>>>
>>> Apologies for delay with response.
>>>
>>>> Do you have any thoughts how both cases could be covered:
>>>> 1. Inline not applied to inbound IPsec pkts for short duration of time
>>>> after rte_eth_dev_start() but before sa_init() is executed (which creates SAs).
>>>> 2. SAs not surviving rte_eth_dev_start() on ixgbe driver.
>>>
>>> No, right now I don't have an idea how both can be satisfied.
>>> About 1) - why do you consider that as problem?
>>> As I understand when inline device is started by not properly programmed yet,
>>> these packets will be treated by device as non-ipsec ones (just plain-text
>>> packets).
>>> Then later in ipsec-secgw RX path they will probably be just dropped.
>>> So no harm here, no?
>>
>> [Anoob] Yes, when SAs are not yet created and device is started then HW treats IPsec packets as clear packets and does not apply inline to
>> them. The IPsec packets arrive at application level. This points at a serious hardware issue as a packet intended for IPsec has reached
>> application without being IPsec processed. Application would check if session is created and since the session is available at the time of
>> processing the packet (in the application), it would look like a hardware issue when it is just a false positive.
> 
> I still not sure why is that big deal, but ok, can we then just do
> it in a different way for poll vs event mode for now:
> for event mode do dev_start() after sa_init(),
> for poll mode leave things as it (dev_start(), then sa_init()).
> Would that address the issue?
> 

[Lukasz] We would prefer not to split behaviour of poll and event modes because of workarounds.
Currently sa_init() is being done after dev_start() a result of a workaround for ixgbe driver.
IMHO correct order is to create flows, SAs and as a last step do dev_start() when 
all necessary configuration is in place. We can wait for the fix in the ixgbe driver.

>>>
>>> About 2) - looking at ixgbe code, I think it can be changed to make ipsec flows to
>>> survive device start/stop (though most likely not in 20.05 timeframe).
>>> So I am not sure should we do it at all.
>>> Obviously for ixgbe (with 1K max SA) it is not a problem to save extra info about
>>> HW flows and restore them after stop/start.
>>> Though for modern devices that might have millions of flows maintaining such
>>> info in PMD might become a significant overhead.
>>> As generic question - how PMD should behave in such situation?
>>> Should it support flows survival through dev_start/dev_stop, or should it be app
>>> responsibility to reprogram flows after dev-start/dev_stop.
>>
>> [Anoob] I believe the spec talks about the behavior during start/stop and initialization.
>>
>> "Flow rules are not maintained between successive port initializations. An application exiting without releasing them and restarting must re-
>> create them from scratch."
>>
>> "PMDs, not applications, are responsible for maintaining flow rules configuration when stopping and restarting a port or performing other
>> actions which may affect them."
> 
> But it also says:
> - DPDK does not keep track of flow rules definitions or flow rule objects
>>> automatically.
>>>   Applications may keep track of the former and must keep track of the latter.
>>>   PMDs may also do it for internal needs, however this must not be relied on by
>>> applications.
> 
>>
>> From the above two, following is what I could conclude,
>> 1. Across initializations rte_flow rules are not expected to be retained.
>> 2. Across restarts (dev start/stop), PMDs have to ensure rte_flow rules survive.
> 
> Ok, I get your opinion, but I would like more input from the rest of the community.
> In particular from PMD/flow/security designers/maintainers/users.
> Should PMD be responsible for keeping/restoring all flows rules config accross
> dev_start/dev_stop?
> If yes, does anyone see any implications with that in terms of:
> - increased memory footprint
> - time dev_start would take
> - time flow_rule_add/del will take
> - possible storing of auth/crypto keys inside the PMD
>  
> BTW, I assume that octeontx2 does keep ipsec (all?) flow rules across dev_start/dev_stop?
> If so, does HW can keep it, or PMD restores it at start?
> Or you just rely on table shared between HW and SW?   
> 
>> I haven't looked at the ixgbe PMD yet, but technically start/stop should only be about whether the device is running. 
> 
> Not only.
> It does HW reset/init,  starts RX/TX queues, selects RX/TX functions,
> applies/restores settings to HW, etc. 
> 
>> I believe there are
>> other settings also which need to be retained between start/stop.
>>
>> From https://urldefense.proofpoint.com/v2/url?u=https-3A__doc.dpdk.org_api_rte-5F-5Fethdev-5F8h.html&d=DwIGaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=SchRHhE7GLCjEY4i2a1byjC_FpWgRLtq4-kLvKp3_84&m=psAUJcBSQZSsmUXoxKviYdNco-goWhM5fy7riLn9WZE&s=ACSxcC8gxiy_autYI-OoZ1JCFdP89jCgVC9XAjM9Hh4&e= 
>>
>> Please note that some configuration is not stored between calls to rte_eth_dev_stop()/rte_eth_dev_start(). The following configuration will
>> be retained:
>>
>> - MTU
>> - flow control settings
>> - receive mode configuration (promiscuous mode, all-multicast mode,
>>   hardware checksum mode, RSS/VMDQ settings etc.)
>> - VLAN filtering configuration
>> - default MAC address
>> - MAC addresses supplied to MAC address array
>> - flow director filtering mode (but not filtering rules)
>> - NIC queue statistics mappings
>>
>> Flow director refers to rte_flow, right?
> 
> Yes, and it says mode, not rules.
>  
>>> I looked through PG for rte_flow, and it seems a bit conversional here (at least
>>> for me):
>>> https://urldefense.proofpoint.com/v2/url?u=https-
>>> 3A__doc.dpdk.org_guides_prog-5Fguide_rte-
>>> 5Fflow.html&d=DwIGaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=jPfB8rwwviRSxyLWs
>>> 2n6B-
>>> WYLn1v9SyTMrT5EQqh2TU&m=pUa6XHzwVA2IP7UEe5b98hosHmHjxlLOmn3IxH
>>> hZQAA&s=2BM4iySNyS0svzG26B0PiG8hQoAT2clAtkxiJrytdaI&e=  (11.7):
>>> Form one side:
>>> - DPDK does not keep track of flow rules definitions or flow rule objects
>>> automatically.
>>>   Applications may keep track of the former and must keep track of the latter.
>>>   PMDs may also do it for internal needs, however this must not be relied on by
>>> applications.
>>> But a bit below:
>>> - PMDs, not applications, are responsible for maintaining flow rules
>>> configuration
>>>    when stopping and restarting a port or performing other actions which may
>>> affect them.
>>>   They can only be destroyed explicitly by applications.
>>>
>>> Would like to know what rte_flow maintainers and other interested parties think
>>> about it?
>>
>> [Anoob] I do agree that the above two points can be misleading. But it actually describes the usage of rte_flow API.
>>
>> 1. "DPDK does not keep track of flow rules definitions or flow rule objects automatically. Applications may keep track of the former and
>> must keep track of the latter."
>> DPDK rte_flow spec doesn't provide an API to check if a rule is already created. Application is configuring hardware resources using rte_flow
>> and it has to keep track of what is configured.
>>
>> 2. "PMDs, not applications, are responsible for maintaining flow rules configuration when stopping and restarting a port or performing
>> other actions which may affect them."
>> In this case, we are talking about the state of the hardware. Once the rules are configured, it should survive across stop-start. Application is
>> not required to destroy and create rte_flows in such case.
>>
>> In first case, we are talking about rte_flow objects & rules (like session in case of rte_security) and in the second we are talking about
>> hardware state.
>>
>> I think this patch can be delayed till ixgbe can be fixed. Having it in 20.08 timeframe should be fine.
>>
>>>
>>> Konstantin
>>>
>>>>
>>>> Thanks,
>>>> Lukasz
>>>>
>>>> On 16.04.2020 14:28, Lukas Bartosik [C] wrote:
>>>>> Hi Konstantin,
>>>>>
>>>>> Please see my answer below.
>>>>>
>>>>> Thanks,
>>>>> Lukasz
>>>>>
>>>>> On 16.04.2020 01:47, Ananyev, Konstantin wrote:
>>>>>> External Email
>>>>>>
>>>>>> -------------------------------------------------------------------
>>>>>> ---
>>>>>>
>>>>>>
>>>>>> Hi Lukasz,
>>>>>>
>>>>>>> Hi Konstantin,
>>>>>>>
>>>>>>> In this patch I moved the sa_init() before rte_eth_dev_start() in
>>>>>>> order to avoid dropping of IPsec pkts when a traffic flows and the ipsec-
>>> secgw application is started.
>>>>>>>
>>>>>>> However I remember that during review of event mode patches you
>>>>>>> mentioned that moving sa_init() before rte_eth_dev_start() is an
>>>>>>> issue for one of the Intel drivers.
>>>>>>
>>>>>> Yes, I think so.
>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__mails.dpdk.org_
>>>>>> archives_dev_2019-
>>>>
>>> 2DDecember_153908.html&d=DwIGaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=SchR
>>> HhE7GLC
>>>> jEY4i2a1byjC_FpWgRLtq4-
>>>> kLvKp3_84&m=w3xh94Ox4xIhabfE-
>>> nD2VbEWbh2JTmiscVMb6pJZcYo&s=9rDtRPGK2QBD
>>>> AcY8VQf0HQzXINtQzucwIxU7DB2ND5s&e=
>>>>>> Moving that piece of code (dev_start) after sa_init() breaks ixgbe inline-
>>> crypto support.
>>>>>> As I understand, because configured ipsec flows don't persist dev_start().
>>>>>> At least for ixgbe PMD.
>>>>>> Any reason why to move that code at all?
>>>>>>
>>>>>
>>>>> [Lukasz] We're observing issue in inline mode. When traffic flows
>>>>> and ipsec-secgw application is started then for short period of time
>>>>> inline is not applied by HW and IPsec packets reach the application.
>>>>> This is because
>>>>> sa_init() (which creates security associations SAs for HW) is executed after
>>> rte_eth_dev_start().
>>>>> That's the reason I moved the code. And that movement fixes the
>>>>> issue because now SAs are already created when eth ports are started.
>>>>>
>>>>> Would it be possible to fix the ixgbe so that SAs would survive
>>> rte_eth_dev_start() ?
>>>>> Do you have any other idea how we could cover both cases ?
>>>>>
>>>>>>  > Is this still the case ?
>>>>>>
>>>>>> AFAIK, yes.
>>>>>> Thanks for bringing it to attention.
>>>>>> Konstantin
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Lukasz
>>>>>>>
>>>>>>> On 08.04.2020 13:32, Lukasz Bartosik wrote:
>>>>>>>> In inline event mode when traffic flows and the ipsec-secgw app
>>>>>>>> is started then for short period of time IPsec packets arrive at
>>>>>>>> application without being decrypted and are dropped by the
>>>>>>>> application. This happens because eth ports are started before
>>>>>>>> creation of inline sessions and IPsec flows. This fix rearranges
>>>>>>>> the code in such a way that eth ports are always started after
>>>>>>>> creation of inline sessions and IPsec flows.
>>>>>>>>
>>>>>>>> Change-Id: Ifddc446082fb2897f81559517f90e1ee603e13f3
>>>>>>>> Signed-off-by: Lukasz Bartosik <lbartosik at marvell.com>
>>>>>>>> ---
>>>>>>>>  examples/ipsec-secgw/event_helper.c | 26
>>>>>>>> --------------------------  examples/ipsec-secgw/ipsec-secgw.c  |
>>>>>>>> 26 +++++++++++++-------------
>>>>>>>>  2 files changed, 13 insertions(+), 39 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/examples/ipsec-secgw/event_helper.c
>>>>>>>> b/examples/ipsec-secgw/event_helper.c
>>>>>>>> index 076f1f2..da861e4 100644
>>>>>>>> --- a/examples/ipsec-secgw/event_helper.c
>>>>>>>> +++ b/examples/ipsec-secgw/event_helper.c
>>>>>>>> @@ -1526,7 +1526,6 @@ int32_t
>>>>>>>>  eh_devs_init(struct eh_conf *conf)  {
>>>>>>>>  	struct eventmode_conf *em_conf;
>>>>>>>> -	uint16_t port_id;
>>>>>>>>  	int ret;
>>>>>>>>
>>>>>>>>  	if (conf == NULL) {
>>>>>>>> @@ -1558,16 +1557,6 @@ eh_devs_init(struct eh_conf *conf)
>>>>>>>>  	/* Display the current configuration */
>>>>>>>>  	eh_display_conf(conf);
>>>>>>>>
>>>>>>>> -	/* Stop eth devices before setting up adapter */
>>>>>>>> -	RTE_ETH_FOREACH_DEV(port_id) {
>>>>>>>> -
>>>>>>>> -		/* Use only the ports enabled */
>>>>>>>> -		if ((conf->eth_portmask & (1 << port_id)) == 0)
>>>>>>>> -			continue;
>>>>>>>> -
>>>>>>>> -		rte_eth_dev_stop(port_id);
>>>>>>>> -	}
>>>>>>>> -
>>>>>>>>  	/* Setup eventdev */
>>>>>>>>  	ret = eh_initialize_eventdev(em_conf);
>>>>>>>>  	if (ret < 0) {
>>>>>>>> @@ -1589,21 +1578,6 @@ eh_devs_init(struct eh_conf *conf)
>>>>>>>>  		return ret;
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> -	/* Start eth devices after setting up adapter */
>>>>>>>> -	RTE_ETH_FOREACH_DEV(port_id) {
>>>>>>>> -
>>>>>>>> -		/* Use only the ports enabled */
>>>>>>>> -		if ((conf->eth_portmask & (1 << port_id)) == 0)
>>>>>>>> -			continue;
>>>>>>>> -
>>>>>>>> -		ret = rte_eth_dev_start(port_id);
>>>>>>>> -		if (ret < 0) {
>>>>>>>> -			EH_LOG_ERR("Failed to start eth dev %d, %d",
>>>>>>>> -				   port_id, ret);
>>>>>>>> -			return ret;
>>>>>>>> -		}
>>>>>>>> -	}
>>>>>>>> -
>>>>>>>>  	return 0;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> diff --git a/examples/ipsec-secgw/ipsec-secgw.c
>>>>>>>> b/examples/ipsec-secgw/ipsec-secgw.c
>>>>>>>> index 5fde4f7..e03bd89 100644
>>>>>>>> --- a/examples/ipsec-secgw/ipsec-secgw.c
>>>>>>>> +++ b/examples/ipsec-secgw/ipsec-secgw.c
>>>>>>>> @@ -2829,6 +2829,19 @@ main(int32_t argc, char **argv)
>>>>>>>>  	if (ret < 0)
>>>>>>>>  		rte_exit(EXIT_FAILURE, "eh_devs_init failed, err=%d\n", ret);
>>>>>>>>
>>>>>>>> +	/* Replicate each context per socket */
>>>>>>>> +	for (i = 0; i < NB_SOCKETS && i < rte_socket_count(); i++) {
>>>>>>>> +		socket_id = rte_socket_id_by_idx(i);
>>>>>>>> +		if ((socket_ctx[socket_id].mbuf_pool != NULL) &&
>>>>>>>> +			(socket_ctx[socket_id].sa_in == NULL) &&
>>>>>>>> +			(socket_ctx[socket_id].sa_out == NULL)) {
>>>>>>>> +			sa_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> +			sp4_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> +			sp6_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> +			rt_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> +		}
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>>  	/* start ports */
>>>>>>>>  	RTE_ETH_FOREACH_DEV(portid) {
>>>>>>>>  		if ((enabled_port_mask & (1 << portid)) == 0) @@ -2866,19
>>>>>>>> +2879,6 @@ main(int32_t argc, char **argv)
>>>>>>>>  			rte_exit(EXIT_FAILURE, "failed at reassemble init");
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> -	/* Replicate each context per socket */
>>>>>>>> -	for (i = 0; i < NB_SOCKETS && i < rte_socket_count(); i++) {
>>>>>>>> -		socket_id = rte_socket_id_by_idx(i);
>>>>>>>> -		if ((socket_ctx[socket_id].mbuf_pool != NULL) &&
>>>>>>>> -			(socket_ctx[socket_id].sa_in == NULL) &&
>>>>>>>> -			(socket_ctx[socket_id].sa_out == NULL)) {
>>>>>>>> -			sa_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> -			sp4_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> -			sp6_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> -			rt_init(&socket_ctx[socket_id], socket_id);
>>>>>>>> -		}
>>>>>>>> -	}
>>>>>>>> -
>>>>>>>>  	check_all_ports_link_status(enabled_port_mask);
>>>>>>>>
>>>>>>>>  	/* launch per-lcore init on every lcore */
>>>>>>>>


More information about the dev mailing list