[dpdk-dev] [RFC] ipsec: new library for IPsec data-path processing

Ananyev, Konstantin konstantin.ananyev at intel.com
Wed Oct 3 01:56:23 CEST 2018


Hi Jerin,

> > > > > > >
> > > > > > > Anyway, let's pretend we found some smart way to distribute inbound packets for the same SA to multiple HW queues/CPU
> > > > > cores.
> > > > > > > To make ipsec processing for such case to work correctly just atomicity on check/update segn/replay_window is not enough.
> > > > > > > I think it would require some extra synchronization:
> > > > > > > make sure that we do final packet processing (seq check/update) at the same order as we received the packets
> > > > > > > (packets entered ipsec processing).
> > > > > > > I don't really like to introduce such heavy mechanisms on SA level,  after all it supposed to be light and simple.
> > > > > > > Though we plan CTX level API to support such scenario.
> > > > > > > What I think would be useful addition for SA level API - have an ability to do one update seqn/replay_window and multiple checks
> > > > > concurrently.
> > > > > > >
> > > > > > > > In case of ingress also, the same problem exists. We will not be able to use RSS and spread the traffic to multiple cores.
> > > > > Considering
> > > > > > > > IPsec being CPU intensive, this would limit the net output of the chip.
> > > > > > > That's true - but from other side implementation can offload heavy part
> > > > > > > (encrypt/decrypt, auth) to special HW (cryptodev).
> > > > > > > In that case single core might be enough for SA and extra synchronization would just slowdown things.
> > > > > > > That's why I think it should be configurable  what behavior (ST or MT) to use.
> > > > > > I do agree that these are the issues that we need to address to make the
> > > > > > library MT safe. Whether the extra synchronization would slow down things is
> > > > > > a very subjective question and will heavily depend on the platform. The
> > > > > > library should have enough provisions to be able to support MT without
> > > > > > causing overheads to ST. Right now, the library assumes ST.
> > > > >
> > > > >
> > > > > I agree with Anoob here.
> > > > >
> > > > > I have two concerns with librte_ipsec as a separate library
> > > > >
> > > > > 1) There is an overlap with rte_security and new proposed library.
> > > >
> > > > I don't think there really is an overlap.
> > >
> > > As mentioned in your other email. IMO, There is an overlap as
> > > RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL can support almost everything
> > > in HW or HW + SW if some PMD wishes to do so.
> > >
> > > Answering some of the questions, you have asked in other thread based on
> > > my understanding.
> > >
> > > Regarding RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL support,
> > > Marvell/Cavium CPT hardware on next generation HW(Planning to upstream
> > > around v19.02) can support RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL and
> > > Anoob already pushed the application changes in ipsec-gw.
> >
> > Ok good to know.
> >
> > >
> > > In our understanding of HW/SW roles/responsibilities for that type of
> > > devices are:
> > >
> > > INLINE_PROTOCOL
> > > ----------------
> > > In control path, security session is created with the given SA and
> > > rte_flow configuration etc.
> > >
> > > For outbound traffic, the application will have to do SA lookup and
> > > identify the security action (inline/look aside crypto/protocol). For
> > > packets identified for inline protocol processing, the application would
> > > submit as plain packets to the ethernet device and the security capable
> > > ethernet device would perform IPSec and send out the packet. For PMDs
> > > which would need extra metadata (capability flag), set_pkt_metadata
> > > function pointer would be called (from application).
> > > This can be used to set some per packet field to identify the security session to be used to
> > > process the packet.
> >
> > Yes, as I can see, that's what ipsec-gw is doing right now and it wouldn't be
> > a problem to do the same in ipsec lib.
> >
> > > Sequence number updation will be done by the PMD.
> >
> > Ok, so for INLINE_PROTOCOL upper layer wouldn't need to keep track for SQN values at all?
> > You don’t' consider a possibility that by some reason that SA would need to
> > be moved from device that support INLINE_PROTOCOL to the device that doesn't?
> 
> For INLINE_PROTOCOL, the application won't have any control over such
> per packet fields. As for moving the SA to a different device, right now
> rte_security spec doesn't allow that. May be we should fix the spec to
> allow multiple devices to share the same security session. That way, if
> there is error in the inline processing, application will be able to
> submit the packet to LOOKASIDE_PROTOCOL crypto device (sharing the
> session) and get the packet processed.
> 

Yep, that my thought too.
If we want to support such scenarios with lookaside-proto and inline-proto devices,
then rte_security need to be changed/extended.

> 
> >
> > > For inbound traffic, the packets for IPSec would be identified by using
> > > rte_flow (hardware accelerated packet filtering). For the packets
> > > identified for inline offload (SECURITY action), hardware would perform
> > > the processing. For inline protocol processed IPSec packets, PMD would
> > > set “user data” so that application can get the details of the security
> > > processing done on the packet. Once the plain packet (after IPSec
> > > processing) is received, a selector check need to be performed to make
> > > sure we have a valid packet after IPSec processing. The user data is used
> > > for that. Anti-replay check is handled by the PMD. The PMD would raise
> > > an eth event in case of sequence number expiry or any SA expiry.
> >
> > Few questions here:
> > 1) if I understand things right - to specify that it was an IPsec packet -
> > PKT_RX_SEC_OFFLOAD will be set in mbuf ol_flags?
> > 2) Basically 'userdata' will contain just a user provided at rte_security_session_create pointer
> > (most likely pointer to the SA, as it is done right now in ipsec-secgw), correct?
> 
> Yes to 1 & 2.
> 
> 
> > 3) in current rte_security API si there a way to get/set replay window size, etc?
> 
> Not right now. But Akhil mentioned that it will be added soon.
> 
> 
> > 4)   Same question as for TX: you don't plan to support fallback to other type of devices/SW?
> > I.E. HW was not able to process ipsec packet by some reason (let say fragmented packet)
> > and now it is SW responsibility to do so?
> > The reason I am asking for that - it seems right now there is no defined way
> > to share SQN related information between HW/PMD and upper layer SW.
> > Is that ok, or would we need such capability?
> > If we would, and upper layer SW would need to keep track on SQN anyway,
> > then there is probably no point to do same thing in PMD itelf?
> > In that case PMD just need to provide SQN information to the upper layer
> > (probably one easy way to do it - reuse rte_,buf.seqn for that purpose,
> > though for that will probably need make it 64-bit long).
> 
> The spec doesn't allow doing IPsec partially on HW & SW. The way spec is
> written (and implemented in ipsec-secgw) to allow one kind of
> RTE_SECURITY_ACTION_TYPE for one SA. If HW is not able to process packet
> received on INLINE_PROTOCOL SA, then it is treated as error. Handling
> fragmentation is a very valid scenario. We will have to edit the spec if
> we need to handle this scenario.
> 
> >
> > >
> > >
> > > LOOKASIDE_PROTOCOL
> > > ------------------
> > > In control path, security session is created with the given SA.
> > >
> > > Enqueue/dequeue is similar to what is done for regular crypto
> > > (RTE_SECURITY_ACTION_TYPE_NONE) but all the protocol related processing
> > > would be offloaded. Application will need to do SA lookup and identify
> > > the processing to be done (both in case of outbound & inbound), and
> > > submit packet to crypto device. Application need not do any IPSec
> > > related transformations other than the lookup. Anti-replay need to be
> > > handled in the PMD (the spec says the device “may be handled” do anti-replay check,
> > > but a complete protocol offload would need anti-replay check also).
> >
> > Same question here - wouldn't there be a situations when HW/PMD would need to
> > share SQN information with upper layer?
> > Let say if upper layer SW would need to do load balancing between crypto-devices
> > with LOOKASIDE_PROTOCOL and without?
> 
> Same answer as above. ACTION is tied to security session which is tied
> to SA. SQN etc is internal to the session and so load balancing between
> crypto-devices is not supported.
> 
> >
> > >
> > >
> > > > rte_security is a 'framework for management and provisioning of security protocol operations offloaded to hardware based devices'.
> > > > While rte_ipsec is aimed to be a library for IPsec data-path processing.
> > > > There is no plans for rte_ipsec to 'obsolete' rte_security.
> > > > Quite opposite rte_ipsec supposed to work with both rte_cryptodev and rte_security APIs (devices).
> > > > It is possible to have an SA that would use both crypto and  security devices.
> > > > Or to have an SA that would use multiple crypto devs
> > > > (though right now it is up the user level to do load-balancing logic).
> > > >
> > > > > For IPsec, If an application needs to use rte_security for HW
> > > > > implementation and and application needs to use librte_ipsec for
> > > > >  SW implementation then it is bad and a lot duplication of work on
> > > > > he slow path too.
> > > >
> > > > The plan is that application would need to use just rte_ipsec API for all data-paths
> > > > (HW/SW, lookaside/inline).
> > > > Let say right now there is rte_ipsec_inline_process() function if user
> > > > prefers to use inline security device to process given group packets,
> > > > and rte_ipsec_crypto_process(/prepare) if user decides to use
> > > > lookaside security or simple crypto device for it.
> > > >
> > > > >
> > > > > The rte_security spec can support both inline and look-aside IPSec
> > > > > protocol support.
> > > >
> > > > AFAIK right now rte_security just provides API to create/free/manipulate security sessions.
> > > > I don't see how it can support all the functionality mentioned above,
> > > > plus SAD and SPD.
> > >
> > >
> > > At least for INLINE_PROTOCOL case SA lookup for inbound traffic does by
> > > HW.
> >
> > For inbound yes, for outbound I suppose you still would need to do a lookup in SW.
> 
> Yes
> 
> >
> > >
> > > >
> > > > >
> > > > > 2) This library is tuned for fat CPU core in mind like single SA on core
> > > > > etc. Which is fine for x86 servers and arm64 server category of machines
> > > > > but it does not work very well with NPU class of SoC or FPGA.
> > > > >
> > > > > As there  are the different ways to implement the IPSec, For instance,
> > > > > use of eventdev can help in situation for handling millions of SA and
> > > > > equence number of update and anti reply check can be done by leveraging
> > > > > some of the HW specific features like
> > > > > ORDERED, ATOMIC schedule type(mapped as eventdev feature)in HW with PIPELINE model.
> > > > >
> > > > > # Issues with having one SA one core,
> > > > > - In the outbound side, there could be multiple flows using the same SA.
> > > > >   Multiple flows could be processed parallel on different lcores,
> > > > > but tying one SA to one core would mean we won't be able to do that.
> > > > >
> > > > > - In the inbound side, we will have a fat flow hitting one core. If
> > > > >   IPsec library assumes single core, we will not be able to to spread
> > > > > fat flow to multiple cores. And one SA-one core would mean all ports on
> > > > > which we would expect IPsec traffic has to be handled by that core.
> > > >
> > > > I suppose that all refers to the discussion about MT safe API for rte_ipsec, right?
> > > > If so, then as I said in my reply to Anoob:
> > > > We will try to make API usable in MT environment for v1,
> > > > so you can review and provide comments at early stages.
> > >
> > > OK
> > >
> > > >
> > > > >
> > > > > I have made a simple presentation. This presentation details ONE WAY to
> > > > > implement the IPSec with HW support on NPU.
> > > > >
> > > > > https://docs.google.com/presentation/d/1e3IDf9R7ZQB8FN16Nvu7KINuLSWMdyKEw8_0H05rjj4/edit?usp=sharing
> > > > >
> > > >
> > > > Thanks, quite helpful.
> > > > Actually from page 3, it looks like your expectations don't contradict in general with proposed API:
> > > >
> > > > ...
> > > > } else if (ev.event_type == RTE_EVENT_TYPE_LCORE && ev.sub_event_id == APP_STATE_SEQ_UPDATE) {
> > > >                         sa = ev.flow_queue_id;
> > > >                         /* do critical section work per sa */
> > > >                         do_critical_section_work(sa);
> > > >
> > > > [KA] that's the place where I expect either
> > > > rte_ipsec_inline_process(sa, ...); OR rte_ipsec_crypto_prepare(sa, ...);
> > > > would be called.
> > >
> > > Makes sense. But currently, the library defines what is
> > > rte_ipsec_inline_process() and rte_ipsec_crypto_prepare(), but it should
> > > be based on underneath security device or crypto device.
> >
> > Reason for that - their code-paths are quite different:
> > for inline devices we can do whole processing synchronously(within process() function),
> > while fro crypto it is sort of split into tw parts -
> > we first have to do prepare();enqueue() them to crypto-dev, and then dequeue();process().
> > Another good thing with that way - it allows the same SA to work with different devices.
> >
> > >
> > > So, IMO for better control, these functions should be the function pointer
> > > based and based on underlying device, library can fill the
> > > implementation.
> > >
> > > IMO, it is not possible to create "static inline function" with all "if"
> > > checks. I think, we can have four ipsec functions with function pointer
> > > scheme.
> > >
> > > rte_ipsec_inbound_prepare()
> > > rte_ipsec_inbound_process()
> > > rte_ipsec_outbound_prepare()
> > > rte_ipsec_outbound_process()
> > >
> > > Some of the other concerns:
> > > 1) For HW implementation, rte_ipsec_sa needs to opaque like rte_security
> > > as some of the structure defined by HW or Microcode. We can choose
> > > absolute generic items as common and device/rte_security specific can be opaque.
> >
> > I don't think it would be a problem, rte_ipsec_sa  does contain a pointer to
> > rte_security_session, so it can provide it as an argument to these functions.
> 
> The rte_ipsec_sa would need some private space for application to store
> it's metadata. There can be SA implementations with additional fields
> for faster lookups. To rephrase, the application should be given some
> provision to store some metadata it would need for faster lookups.
> may sa_init API can give amount private size required.
> 
> 
> >
> > >
> > > 2)I think, in order to accommodate the event drivern model. We need to pass
> > > void ** in prepare() and process() function with an additional argument
> > > of type(TYPE_EVENT/TYPE_MBUF) can be passed to detect packet object
> > > type as some of the functions in prepare() and process() may need
> > > rte_event to operate on.
> >
> > You are talking here about security device specific functions described below, correct?
> >
> > >
> > > >
> > > >                      /* Issue the crypto request and generate the following on crypto work completion */
> > > > [KA] that's the place where I expect rte_ipsec_crypto_process(...) be invoked.
> > > >
> > > >                         ev.flow_queue_id = tx_port;
> > > >                         ev.sub_event_id = tx_queue_id;
> > > >                         ev.sched_sync = RTE_SCHED_SYNC_ATOMIC;
> > > >                         rte_cryptodev_event_enqueue(cryptodev, ev.mbuf, eventdev, ev);
> > > >                 }
> > > >
> > > >
> > > > > I am not saying this should be the ONLY way to do as it does not work
> > > > > very well with non NPU/FPGA class of SoC.
> > > > >
> > > > > So how about making the proposed IPSec library as plugin/driver to
> > > > > rte_security.
> > > >
> > > > As I mentioned above, I don't think that pushing whole IPSec data-path into rte_security
> > > > is the best possible approach.
> > > > Though I probably understand your concern:
> > > > In RFC code we always do whole prepare/process in SW (attach/remove ESP headers/trailers, so paddings etc.),
> > > > i.e. right now only device types: RTE_SECURITY_ACTION_TYPE_NONE and RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO are
> covered.
> > > > Though there are devices where most of prepare/process can be done in HW
> > > > (RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL/RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL),
> > > > plus in future could be devices where prepare/process would be split between HW/SW in a custom way.
> > > > Is that so?
> > > > To address that issue I suppose we can do:
> > > > 1. Add support for RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL and RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
> > > >     security devices into ipsec.
> > > >     We planned to do it anyway, just don't have it done yet.
> > > > 2. For custom case - introduce RTE_SECURITY_ACTION_TYPE_INLINE_CUSTOM and
> RTE_SECURITY_ACTION_TYPE_LOOKASIDE_CUSTOM
> > >
> > > The problem is, CUSTOM may have different variants and "if" conditions won't
> > > scale if we choose to have non function pointer scheme. Otherwise, it
> > > looks OK to create new SECURITY TYPE and associated plugin for prepare() and process()
> > > function in librte_ipsec library.
> >
> > In principle, I don't mind to always use function pointers for prepare()/process(), but:
> > from your description above of INLINE_PROTOCOL and LOOKASIDE_PROTOCOL
> > the process()/prepare() for such devices looks well defined and
> > straightforward to implement.
> > Not sure we'll need a function pointer for such simple and lightweight case:
> > set/check ol_flags, set/read userdata value.
> > I think extra function call here is kind of overkill and will only slowdown things.
> > But if that would be majority preference - I wouldn't argue.
> > BTW if we'll agree to always use function pointers for process/prepare,
> > then there is no point to have that all existing action types -
> > all we need is an indication is it inline or lookaside device and
> > function pointers for prepare/process().
> 
> Me too not a fan of function pointer scheme. But options are limited.
> 
> Though the generic usage seems straightforward, the implementation of
> the above modes can be very different. Vendors could optimize various
> operations (SQN update for example) for better performance on their
> hardware. Sticking to one approach would negate that advantage.
> 
> Another option would be to use multiple-worker model that Anoob had
> proposed some time back.
> https://mails.dpdk.org/archives/dev/2018-June/103808.html
> 
> Idea would be to make all lib_ipsec functions added as static inline
> functions.
> 
> static inline rte_ipsec_add_tunnel_hdr(struct rte_mbuf *mbuf);
> static inline rte_ipsec_update_sqn(struct rte_mbuf *mbuf, &seq_no);
> ...
> 
> For the regular use case, a fat
> rte_ipsec_(inbound/outbound)_(prepare/process) can be provided. The
> worker implemented for that case can directly call the function and
> forget about the other modes. For other vendors with varying
> capabilities, there can be multiple workers taking advantage of the hw
> features. For such workers, the static inline functions can be used as
> required. This gives vendors opportunity to pick and choose what they
> want from the ipsec lib. The worker to be used for that case will be
> determined based on the capabilities exposed by the PMDs.
> 
> https://mails.dpdk.org/archives/dev/2018-June/103828.html
> 
> The above email explains how multiple workers can be used with l2fwd.
> 
> For this to work, the application & library code need to be modularised.
> Like what is being done in the following series,
> https://mails.dpdk.org/archives/dev/2018-June/103786.html
> 
> This way one application can be made to run on multiple platforms, with
> the app being optimized for the platform on which it would run.
> 
> /* ST SA - RTE_SECURITY_ACTION_TYPE_NONE - CRYPTODEV - NO EVENTDEV*/
> worker1()
> {
>      while(true) {
>          nb_pkts = rte_eth_rx_burst();
> 
>          if (nb_pkts != 0) {
>              /* Do lookup */
>              rte_ipsec_inbound_prepare();
>              rte_cryptodev_enqueue_burst();
>              /* Update in-flight */
>          }
> 
>          if (in_flight) {
>              rte_cryptodev_dequeue_burst();
>              rte_ipsec_outbound_process();
>          }
>          /* route packet */
> }
> 
> #include <rte_ipsec.h>   /* For IPsec lib static inlines */
> 
> static inline rte_event_enqueue(struct rte_event *ev)
> {
>      ...
> }
> 
> /* MT safe SA - RTE_SECURITY_ACTION_TYPE_NONE - CRYPTODEV - EVENTDEV)
> worker2()
> {
>      while(true) {
>          nb_pkts = rte_eth_rx_burst();
> 
>          if (nb_pkts != 0) {
>              /* Do lookup */
>             rte_ipsec_add tunnel(ev->mbuf);
>             rte_event_enqueue(ev)
>             rte_cryptodev_enqueue_burst(ev->mbuf);
>              /* Update in-flight */
>          }
> 
>          if (in_flight) {
>              rte_cryptodev_dequeue_burst();
>              rte_ipsec_outbound_process();
>          }
>          /* route packet */
> }

Hmm, not sure how these 2 cases really differs in terms of ipsec processing.
I do understand the in second one we use events to propagate packets through the system,
and that eventdev might be smart enough to preserve packet ordering, etc.
But in terms of ipsec processing we have to do exactly the same for both cases.
Let say for the example above (outbound, crytpodev):
a) lookup an SA
b) increment SA.SQN and check for overflow
d) generate IV
e) generate & fill ESP header/trailer, tunnel header
f) perform actual encrypt, generate digest

So crypto_prepare() - deals with b)-e).
f) is handled by cryptodev. 
Yes, step b) might need to be atomic, or might not -
depends on particular application design.
But in both cases (polling/eventdev) we do need all these steps to be performed.
Konstantin

> 
> In short,
> 
> 1) Have separate small inline functions in library
> 2) If something can be grouped, it can be exposed a specific function
> to address a specific usecases
> 3) Let remaining code, can go in application as different worker() to
> address all the usecases.
> 
> >
> > Konstantin
> >
> > >
> > >
> > > >     and add into rte_security_ops   new functions:
> > > >     uint16_t lookaside_prepare(struct rte_security_session *sess, struct rte_mbuf *mb[], struct struct rte_crypto_op *cop[], uint16_t
> num);
> > > >     uint16_t lookaside_process(struct rte_security_session *sess, struct rte_mbuf *mb[], struct struct rte_crypto_op *cop[], uint16_t
> num);
> > > >     uint16_t inline_process(struct rte_security_session *sess, struct rte_mbuf *mb[], struct struct rte_crypto_op *cop[], uint16_t num);
> > > >     So for custom HW, PMD can overwrite normal prepare/process behavior.
> > > >
> > > > > This would give flexibly for each vendor/platform choose to different
> > > > > IPse implementation based on HW support WITHOUT CHANGING THE APPLICATION
> > > > > INTERFACE.
> > > >
> > > > Not sure what API changes you are referring to?
> > > > As I am aware we do introduce new API, but all existing APIs remain in place.
> > >
> > >
> > > What I meant was, Single application programming interface to enable IPSec processing to
> > > application.
> > >
> > >
> > > >
> > > > >
> > > > > IMO, rte_security IPsec look aside support can be simply added by
> > > > > creating the virtual crypto device(i.e move the proposed code to the virtual crypto device)
> > > > > likewise inline support
> > > > > can be added by the virtual ethdev device.
> > > >
> > > > That's probably possible and if someone would like to introduce such abstraction - NP in general
> > > > (though my suspicion - it might be too heavy to be really useful).
> > > > Though I don't think it should be the only possible way for the user to enable IPsec data-processing inside his app.
> > > > Again I guess such virtual-dev will still use rte_ipsec inside.
> > >
> > > I don't have strong opinion on virtual devices VS function pointer based
> > > prepare() and process() function in librte_ipsec library.
> > >
> > > >
> > > > > This would avoid the need for
> > > > > updating ipsec-gw application as well i.e unified interface to application.
> > > >
> > > > I think - it would  really good to simplify existing ipsec-secgw sample app.
> > > > Some parts of it seems unnecessary complex to me.
> > > > One of the reasons for it -  we don't really have an unified (and transparent) API for ipsec data-path.
> > > > Let's look at ipsec_enqueue() and related code (examples/ipsec-secgw/ipsec.c:365)
> > > > It is huge (and ugly) -  user has to handle dozen different cases just to enqueue packet for IPsec processing.
> > > > One of the aims of rte_ipsec library - hide all that complexities inside the library and provide to
> > > > the upper layer clean and transparent API.
> > > >
> > > > >
> > > > > If you don't like the above idea, any scheme of plugin based
> > > > > implementation would be fine so that vendor or platform can choose its own implementation.
> > > > > It can be based on partial HW implement too. i.e SA look can be used in SW, remaining stuff in HW
> > > > > (for example IPsec inline case)
> > > >
> > > > I am surely ok with the idea to give vendors an ability to customize implementation
> > > > and enable their HW capabilities.
> > >
> > > I think, We are on the same page, just that the fine details of "framework"
> > > for customizing implementation based on their HW capabilities need to
> > > iron out.
> > >
> > > > Do you think proposed additions to the rte_security would be  enough,
> > > > or something extra is needed?
> > >
> > > See above.
> > >
> > > Jerin
> > >
> > > >
> > > > Konstantin
> > > >
> > > >
> > > > >
> > > > > # For protocols like UDP, it makes sense to create librte_udp as there
> > > > > no much HW specific offload other than ethdev provides.
> > > > >
> > > > > # PDCP could be another library to offload to HW, So talking
> > > > > rte_security path makes more sense in that case too.
> > > > >
> > > > > Jerin


More information about the dev mailing list