[dpdk-dev] [RFC] ipsec: new library for IPsec data-path processing
    Joseph, Anoob 
    Anoob.Joseph at caviumnetworks.com
       
    Mon Sep 17 16:41:31 CEST 2018
    
    
  
Hi Konstantin,
On 17-09-2018 16:06, Ananyev, Konstantin wrote:
> Hi Anoob,
>
>> Hi Konstantin,
>> Please see inline.
>>
>>
>> This RFC introduces a new library within DPDK: librte_ipsec.
>> The aim is to provide a DPDK-native, high-performance library for IPsec
>> data-path processing.
>> The library is supposed to utilize the existing DPDK crypto-dev and
>> security APIs to provide applications with a transparent IPsec processing API.
>> The library is concentrated on data-path protocols processing (ESP and AH),
>> IKE protocol(s) implementation is out of scope for that library.
>> Hook/callback mechanisms will be defined, though, to allow integrating it
>> with existing IKE implementations.
>> Due to the quite complex nature of the IPsec protocol suite and the variety
>> of user requirements and usage scenarios, a few API levels will be provided:
>> 1) Security Association (SA-level) API
>>       Operates at SA level, provides functions to:
>>       - initialize/teardown SA object
>>       - process inbound/outbound ESP/AH packets associated with the given SA
>>         (decrypt/encrypt, authenticate, check integrity,
>>          add/remove ESP/AH related headers and data, etc.).
>> 2) Security Association Database (SAD) API
>>       API to create/manage/destroy IPsec SAD.
>>       While DPDK IPsec library plans to have its own implementation,
>>       the intention is to keep it as independent from the other parts
>>       of IPsec library as possible.
>>       That is supposed to give users the ability to provide their own
>>       implementation of the SAD compatible with the other parts of the
>>       IPsec library.
>> 3) IPsec Context (CTX) API
>>       This is supposed to be a high-level API, where each IPsec CTX is an
>>       abstraction of 'independent copy of the IPsec stack'.
>>       A CTX owns a set of SAs, SADs, the crypto-dev queues assigned to it, etc.
>>       and provides:
>>       - de-multiplexing stream of inbound packets to particular SAs and
>>         further IPsec related processing.
>>       - IPsec related processing for the outbound packets.
>>       - SA add/delete/update functionality
>> [Anoob]: Security Policy is an important aspect of IPsec. An IPsec
>> library without Security Policy API would be incomplete. For inline
>> protocol offload, the final SP-SA check(selector check) is the only
>> IPsec part being done by ipsec-secgw now. Would make sense to add that
>> also in the library.
>>
>> You mean here, that we need some sort of SPD implementation, correct?
>> [Anoob] Yes.
>>
>> Ok, I see.
>> Our thought was that just something based on librte_acl would be enough here...
>> But if you think that a special defined SPD API (and implementation) is needed -
>> we can probably discuss it along with SAD API (#2 above).
>> Though if you'd like to start to work on RFC for it right-away - please feel free to do so :)
>>
>>
>>
>>
>> Current RFC concentrates on SA-level API only (1),
>> detailed discussion for 2) and 3) will be subjects for separate RFC(s).
>>
>> SA (low) level API
>> ==================
>>
>> API described below operates on SA level.
>> It provides functionality that allows user for given SA to process
>> inbound and outbound IPsec packets.
>> To be more specific:
>> - for inbound ESP/AH packets perform decryption, authentication,
>>     integrity checking, remove ESP/AH related headers
>> [Anoob] Anti-replay check would also be required.
>>
>> Yep, anti-replay and ESN support is implied as part of "integrity checking".
>> Probably I have to be more specific here.
>> [Anoob] This is fine.
>>
>>
>>
>> - for outbound packets perform payload encryption, attach ICV,
>>     update/add IP headers, add ESP/AH headers/trailers,
>>     setup related mbuf fields (ol_flags, tx_offloads, etc.).
>> [Anoob] Do we have any plans to handle ESN expiry? Some means to
>> initiate an IKE renegotiation? I'm assuming application won't be aware
>> of the sequence numbers, in this case.
>> [Anoob] What is your plan with events like ESN expiry? IPsec spec talks about byte and time expiry as well.
>>
>> At current moment, for SA level: rte_ipsec_crypto_prepare()/rte_ipsec_inline_process() will set rte_errno
>> to special value (EOVERFLOW) to signal upper layer that limit is reached.
>> Upper layer can decide to start re-negotiation, or just destroy an SA.
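A minimal sketch of how an upper layer might react to that signal. The names here (`expiry_sa`, `expiry_prepare`, `expiry_handle_burst`) are hypothetical stand-ins, not the proposed rte_ipsec API, and plain `errno` is used in place of `rte_errno` so the example is self-contained:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for an SA; only the fields needed here. */
struct expiry_sa {
	uint64_t sqn;     /* last sequence number handed out */
	uint64_t sqn_max; /* limit after which the SA must be re-keyed */
};

enum expiry_action { EXPIRY_CONTINUE, EXPIRY_RENEGOTIATE };

/*
 * Stand-in for a prepare call: accepts up to num packets, returns how
 * many fit under the limit, and sets errno to EOVERFLOW if some were
 * refused because the limit was reached.
 */
static uint16_t
expiry_prepare(struct expiry_sa *sa, uint16_t num)
{
	uint64_t left = sa->sqn_max - sa->sqn;
	uint16_t n = (left < num) ? (uint16_t)left : num;

	sa->sqn += n;
	if (n != num)
		errno = EOVERFLOW;
	return n;
}

/* Upper-layer policy: decide what to do after one prepare call. */
static enum expiry_action
expiry_handle_burst(struct expiry_sa *sa, uint16_t num, uint16_t *done)
{
	errno = 0;
	*done = expiry_prepare(sa, num);
	if (*done != num && errno == EOVERFLOW)
		return EXPIRY_RENEGOTIATE; /* or destroy the SA */
	return EXPIRY_CONTINUE;
}
```

Whether the caller renegotiates or simply destroys the SA is left to the application, as described above.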
>>
>> Future plans for IPsec Context (CTX) API (#3 above):
>> Introduce a special function, something like:
>> rte_ipsec_get_expired(rte_ipsec_ctx *ctx, rte_ipsec_sa *expired_sa[], uint32_t num);
>> It would return up-to *num* of SAs for given ipsec context, that are expired/limit reached.
>> The upper layer could then decide, for each SA, whether to start renegotiation
>> or just wipe the given SA.
>> It would be upper layer responsibility to call this function periodically.
>>
>>
>>
>> - initialize/un-initialize given SA based on user provided parameters.
>>
>> Processed inbound/outbound packets could be grouped by user provided
>> flow id (opaque 64-bit number associated by user with given SA).
>>
>> SA-level API is based on top of crypto-dev/security API and relies on them
>> to perform actual cipher and integrity checking.
>> Due to the nature of crypto-dev API (enqueue/dequeue model) we use
>> asynchronous API for IPsec packets destined to be processed
>> by crypto-device:
>> rte_ipsec_crypto_prepare()->rte_cryptodev_enqueue_burst()->
>> rte_cryptodev_dequeue_burst()->rte_ipsec_crypto_process().
>> Though for packets destined for inline processing no extra overhead
>> is required and simple and synchronous API: rte_ipsec_inline_process()
>> is introduced for that case.
>> [Anoob] The API should include event-delivery as a crypto-op completion
>> mechanism as well. The application could configure the event crypto
>> adapter and then enqueue and dequeue to crypto device using events (via
>> event dev).
>>
>> Not sure what particular extra API you think is required here?
>> As I understand in both cases (with or without event crypto-adapter) we still have to:
>>   1) fill crypto-op properly
>>   2) enqueue it to crypto-dev (via eventdev or directly)
>>   3) receive the crypto-op processed by crypto-dev (either via eventdev or directly)
>>   4) check crypto-op status, do further post-processing if any
>>
>> So #1 and #4 (SA-level API responsibility) remain the same for both cases.
>> [Anoob] rte_ipsec_inline_process works on packets not events. We might need a similar API which processes events.
>>
>> Ok, I still don't get you here.
>> Could you specify exactly what function you'd like to add to the API here, with its
>> parameter list and a brief description of its behavior?
>>
>>
>> The following functionality:
>>     - match inbound/outbound packets to particular SA
>>     - manage crypto/security devices
>>     - provide SAD/SPD related functionality
>>     - determine what crypto/security device has to be used
>>       for given packet(s)
>> is out of scope for SA-level API.
>>
>> Below is the brief (and simplified) overview of expected SA-level
>> API usage.
>>
>> /* allocate and initialize SA */
>> size_t sz = rte_ipsec_sa_size();
>> struct rte_ipsec_sa *sa = rte_malloc(NULL, sz, RTE_CACHE_LINE_SIZE);
>> struct rte_ipsec_sa_prm prm;
>> /* fill prm */
>> rc = rte_ipsec_sa_init(sa, &prm);
>> if (rc != 0) { /*handle error */}
>> .....
>>
>> /* process inbound/outbound IPsec packets that belong to given SA */
>>
>> /* inline IPsec processing was done for these packets */
>> if (use_inline_ipsec)
>>          n = rte_ipsec_inline_process(sa, pkts, nb_pkts);
>> /* use crypto-device to process the packets */
>> else {
>>        struct rte_crypto_op *cops[nb_pkts];
>>        struct rte_ipsec_group grp[nb_pkts];
>>
>>        ....
>>        /* prepare crypto ops */
>>        n = rte_ipsec_crypto_prepare(sa, pkts, cops, nb_pkts);
>>        /* enqueue crypto ops to related crypto-dev */
>>        n = rte_cryptodev_enqueue_burst(..., cops, n);
>>        if (n != nb_pkts) { /* handle failed packets */ }
>>        /* dequeue finished crypto ops from related crypto-dev */
>>        n = rte_cryptodev_dequeue_burst(..., cops, nb_pkts);
>>        /* finish IPsec processing for associated packets */
>>        n = rte_ipsec_crypto_process(cops, pkts, grp, n);
>> [Anoob] Does the SA based grouping apply to both inbound and outbound?
>>
>> Yes, the plan is to have it available for both cases.
>> [Anoob] On the inbound, shouldn't the packets be grouped+ordered based on inner L3+inner L4?
>>
>> I think it's up to the user to decide which criteria to group by,
>> and whether to do any grouping at all.
>> That's why flowid is user-defined and totally transparent to the lib.
>>
>>
>>
>>
>>        /* now we have <n> groups of packets grouped by SA flow id  */
>>       ....
>>    }
>> ...
>>
>> /* uninit given SA */
>> rte_ipsec_sa_fini(sa);
>>
>> Planned scope for 18.11:
>> ========================
>>
>> - SA-level API definition
>> - ESP tunnel mode support (both IPv4/IPv6)
>> - Supported algorithms: AES-CBC, AES-GCM, HMAC-SHA1, NULL.
>> - UT
>> [Anoob] What is UT?
>>
>> Unit-Test
>>
>>
>> Note: Still WIP, so not all planned for 18.11 functionality is in place.
>>
>> Post 18.11:
>> ===========
>> - ESP transport mode support (both IPv4/IPv6)
>> - update examples/ipsec-secgw to use librte_ipsec
>> - SAD and high-level API definition and implementation
>>
>>
>> Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal at intel.com>
>> Signed-off-by: Declan Doherty <declan.doherty at intel.com>
>> Signed-off-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
>> ---
>>    config/common_base                     |   5 +
>>    lib/Makefile                           |   2 +
>>    lib/librte_ipsec/Makefile              |  24 +
>>    lib/librte_ipsec/meson.build           |  10 +
>>    lib/librte_ipsec/pad.h                 |  45 ++
>>    lib/librte_ipsec/rte_ipsec.h           | 245 +++++++++
>>    lib/librte_ipsec/rte_ipsec_version.map |  13 +
>>    lib/librte_ipsec/sa.c                  | 921 +++++++++++++++++++++++++++++++++
>>    lib/librte_net/rte_esp.h               |  10 +-
>>    lib/meson.build                        |   2 +
>>    mk/rte.app.mk                          |   2 +
>>    11 files changed, 1278 insertions(+), 1 deletion(-)
>>    create mode 100644 lib/librte_ipsec/Makefile
>>    create mode 100644 lib/librte_ipsec/meson.build
>>    create mode 100644 lib/librte_ipsec/pad.h
>>    create mode 100644 lib/librte_ipsec/rte_ipsec.h
>>    create mode 100644 lib/librte_ipsec/rte_ipsec_version.map
>>    create mode 100644 lib/librte_ipsec/sa.c
>> <snip>
>> +static inline uint16_t
>> +esp_outb_tun_prepare(struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
>> +       struct rte_crypto_op *cop[], uint16_t num)
>> +{
>> +       int32_t rc;
>> +       uint32_t i, n;
>> +       union sym_op_data icv;
>> +
>> +       n = esn_outb_check_sqn(sa, num);
>> +
>> +       for (i = 0; i != n; i++) {
>> +
>> +               sa->sqn++;
>> [Anoob] Shouldn't this be done atomically?
>>
>> If we want to have MT-safe API for SA-datapath API, then yes.
>> Though it would make things more complicated here, especially for inbound with anti-replay support.
>> I think it is doable (spin-lock?), but would cause extra overhead and complexity.
>> Right now I am not sure it is really worth it - comments/suggestions are welcome.
>> What could probably be a good compromise is a runtime decision on a per-SA basis
>> (at sa_init()): do we need ST or MT behavior for the given SA.
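One way such a per-SA choice could look for the outbound sequence number, as a sketch only. The `sqn_sa` type, the `mt_safe` flag and `sqn_reserve()` are hypothetical and not part of the RFC; C11 atomics are used here instead of a spin-lock, which is just one of the possible options:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical SA fragment: behavior is fixed once, at init time. */
struct sqn_sa {
	int mt_safe;          /* non-zero: several threads may use this SA */
	_Atomic uint64_t sqn; /* last sequence number handed out */
};

/*
 * Reserve num consecutive sequence numbers and return the first one.
 * MT path: a single atomic read-modify-write.
 * ST path: separate load and store, which is cheaper but not safe if
 * two threads call this for the same SA at the same time.
 */
static uint64_t
sqn_reserve(struct sqn_sa *sa, uint32_t num)
{
	uint64_t cur;

	if (sa->mt_safe)
		return atomic_fetch_add_explicit(&sa->sqn, num,
				memory_order_relaxed) + 1;

	cur = atomic_load_explicit(&sa->sqn, memory_order_relaxed);
	atomic_store_explicit(&sa->sqn, cur + num, memory_order_relaxed);
	return cur + 1;
}
```

This only covers the outbound counter; the inbound anti-replay case needs more than an atomic increment.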
>> [Anoob] Going with single thread approach would significantly limit the scope of this library. Single thread approach would mean
>> one SA on one core. This would not work on practical cases.
>> Suppose we have two flows which are supposed to use the same SA. With RSS, these flows could end up on different cores. Now
>> only one core would be able to process, as SA will not be shared. We have the same problem in ipsec-secgw too.
>>
>> Just for my curiosity - how do you plan to use RSS for ipsec packet distribution?
>> Do you foresee a common situation when there would be packets that belong to the same SA
>> (same SPI) but with multiple source(destination) IP addresses?
>> If so, probably some examples would be helpful.
>> I think IPsec RFCs don't prevent such a situation, but AFAIK the most common case is a single source/destination IP pair for the same SPI.
>>
>> sp ipv4 out esp protect 6 pri 1 dst 192.168.1.0/24 sport 0:65535 dport 0:65535
>> sp ipv4 out esp protect 6 pri 1 dst 192.168.2.0/24 sport 0:65535 dport 0:65535
>> sa out 6 cipher_algo aes-128-cbc cipher_key 22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff:00:11 auth_algo sha1-hmac auth_key
>> 22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff:00:11:22:33:44:55 mode ipv4-tunnel src 172.16.2.1 dst 172.16.1.1
>> Isn't this a valid configuration? Wouldn't this be a common use case when we have site-to-site tunneling?
>> https://tools.ietf.org/html/rfc4301#section-4.4.1.1
> Ok, I think I understand what was my confusion here - above you talked about using RSS to distribute incoming *outbound* traffic, correct?
> If so, then yes I think such scheme would work without problems.
> My original thought was that we are talking about inbound traffic distribution here - in that case standard RSS wouldn't help much.
Agreed. RSS won't be of much use in that case (inbound). But a fat flow
hitting one core is still a problem that we should solve. RSS will help
in solving the same problem for outbound, to an extent.
>> Anyway, let's pretend we found some smart way to distribute inbound packets for the same SA to multiple HW queues/CPU cores.
>> To make ipsec processing work correctly in such a case, just atomicity on check/update seqn/replay_window is not enough.
>> I think it would require some extra synchronization:
>> make sure that we do final packet processing (seq check/update) in the same order as we received the packets
>> (packets entered ipsec processing).
>> I don't really like to introduce such heavy mechanisms at the SA level; after all, it is supposed to be light and simple.
>> Though we plan CTX level API to support such scenario.
>> What I think would be useful addition for SA level API - have an ability to do one update seqn/replay_window and multiple checks
>> concurrently.
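To make the check/update split concrete, here is a sketch of a 64-packet sliding anti-replay window with a read-only check and a separate mutating update. The `replay_win` type and both function names are hypothetical and not taken from the patch; the sketch shows only the split itself and does not add the synchronization that one writer plus several concurrent readers would still need:

```c
#include <stdint.h>

#define REPLAY_WIN_SZ 64u

struct replay_win {
	uint64_t top;    /* highest sequence number accepted so far */
	uint64_t bitmap; /* bit i set => (top - i) was already seen */
};

/* Read-only: 0 if seq is acceptable, -1 if invalid, too old or replayed. */
static int
replay_check(const struct replay_win *rw, uint64_t seq)
{
	uint64_t diff;

	if (seq == 0)
		return -1;
	if (seq > rw->top)
		return 0;
	diff = rw->top - seq;
	if (diff >= REPLAY_WIN_SZ)
		return -1;
	return (rw->bitmap & (UINT64_C(1) << diff)) ? -1 : 0;
}

/* Mutating: record seq; call only after the ICV has been verified. */
static void
replay_update(struct replay_win *rw, uint64_t seq)
{
	uint64_t diff;

	if (seq > rw->top) {
		diff = seq - rw->top;
		/* slide the window forward; drop everything if we jump past it */
		rw->bitmap = (diff >= REPLAY_WIN_SZ) ? 0 : rw->bitmap << diff;
		rw->bitmap |= 1;
		rw->top = seq;
		return;
	}
	diff = rw->top - seq;
	if (diff < REPLAY_WIN_SZ)
		rw->bitmap |= UINT64_C(1) << diff;
}
```

With this split, the check can run before the crypto operation is enqueued and the update only after the packet has been authenticated.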
>>
>> In case of ingress also, the same problem exists. We will not be able to use RSS and spread the traffic to multiple cores. Considering
>> IPsec being CPU intensive, this would limit the net output of the chip.
>>
>> That's true - but from other side implementation can offload heavy part
>> (encrypt/decrypt, auth) to special HW (cryptodev).
>> In that case a single core might be enough for an SA, and extra synchronization would just slow things down.
>> That's why I think it should be configurable which behavior (ST or MT) to use.
>> I do agree that these are the issues that we need to address to make the library MT safe. Whether the extra synchronization would
>> slow down things is a very subjective question and will heavily depend on the platform. The library should have enough provisions
>> to be able to support MT without causing overheads to ST. Right now, the library assumes ST.
> Ok, I suppose we both agree that we need ST and MT case supported.
> I didn't want to introduce MT related code right now (for 18.11), but as you guys seem very concerned about it,
> we will try to add MT related stuff into v1, so you can review it at early stages.
> Konstantin
Glad that we are on the same page. As you had pointed out, MT is not 
just about adding some locks. It's more complicated than that. And again 
for solving it, there could be multiple ways. Using locks and software 
queues etc would definitely be one way. But that's not the only 
solution. As Jerin had pointed out, there are parts in IPsec flow which 
can be offloaded to hardware.
http://mails.dpdk.org/archives/dev/2018-September/111770.html
Can you share your thoughts on the above approach?
Anoob
    
    