[dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem

Ori Kam orika at mellanox.com
Sun Feb 23 13:33:56 CET 2020


Hi Jerin,

Best,
Ori
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk at gmail.com>
> Sent: Sunday, February 23, 2020 11:54 AM
> To: Ori Kam <orika at mellanox.com>
> Cc: Jerin Jacob <jerinj at marvell.com>; xiang.w.wang at intel.com; dpdk-dev
> <dev at dpdk.org>; Pavan Nikhilesh <pbhagavatula at marvell.com>; Shahaf
> Shuler <shahafs at mellanox.com>; Hemant Agrawal
> <hemant.agrawal at nxp.com>; Opher Reviv <opher at mellanox.com>; Alex
> Rosenbaum <alexr at mellanox.com>; dovrat at marvell.com; Prasun Kapoor
> <pkapoor at marvell.com>; Nipun Gupta <nipun.gupta at nxp.com>; Richardson,
> Bruce <bruce.richardson at intel.com>; yang.a.hong at intel.com;
> harry.chang at intel.com; gu.jian1 at zte.com.cn; shanjiangh at chinatelecom.cn;
> zhangy.yun at chinatelecom.cn; lixingfu at huachentel.com; wushuai at inspur.com;
> yuyingxia at yxlink.com; fanchenggang at sunyainfo.com;
> davidfgao at tencent.com; liuzhong1 at chinaunicom.cn;
> zhaoyong11 at huawei.com; oc at yunify.com; jim at netgate.com;
> hongjun.ni at intel.com; j.bromhead at titan-ic.com; deri at ntop.org;
> fc at napatech.com; arthur.su at lionic.com; Thomas Monjalon
> <thomas at monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> 
> On Sun, Feb 23, 2020 at 2:12 PM Ori Kam <orika at mellanox.com> wrote:
> >
> > Hi Jerin,
> >
> > Thanks, for the review.
> > PSB
> 
> 
> Hi Ori
> 
> Since we are finalizing the specification part, I thought of
> enumerating the list of work needs to be
> completed for a new subsystem in DPDK.
> 
> 0) Finalize the first version of the spec. Hope v4 will do that.
> 1) Introduce common library code for based on the  specification
> 2) One HW based driver implementation
> 3) One SW reference driver: libpcre library provides complete PCRE
> functionality.
> 4) app/test/test_regexdev.c like app/test/test_eventdev.c
> 5) Need a maintainer for maintaining the regex subsystem
> 6) The first version programming guide documentation
> 7) Add app/test-regexdev like app/test-eventdev
> 8) Add an examples/xxxxxx program
> 

> IMO The following items need to be completed to accept a subsystem in
> dpdk(Need at least on HW and SW driver).
> 
> 0) Finalize the first version of the spec. Hope v4 will do that.

I hope so to 😊

> 1) Introduce common library code for based on the  specification

I'm working on it. as soon as we agree on the API (this RFC will get acked ) I can work on this code.
I will send the entire code for ack when we decide if it will be part of 
20.05 or 20.08.

> 2) One HW based driver implementation

Just like you, our driver will be ready by 20.05 or 20.08

> 3) One SW reference driver: libpcre library provides complete PCRE
> functionality.

O.K. We are not working on this part.

> 4) app/test/test_regexdev.c like app/test/test_eventdev.c

We started to create a super basic app, after the API will be finalized and we will have HW
we can push it. (if you need it faster than feel free)

> 5) Need a maintainer for maintaining the regex subsystem
> 
We wish to maintain it if you agree.

> We have item (3) so Marvell would like to work on item (3). Our HW
> driver may ready by v20.05 or the worst case by 20.08.
> Let us know what other items Mellanox or community would like to work
> on. This is to avoid duplication of work
> to get clarity on the next steps.
> 

See my comments above.
From Mellanox the best date is 20.08 but we are trying to make it to 20.05, 
depended on HW.


> PSB
> 
> 
> >
> >
> > Ori Kam
> >
> > > -----Original Message-----
> > > From: Jerin Jacob <jerinjacobk at gmail.com>
> > > Sent: Saturday, February 22, 2020 6:52 PM
> > > To: Ori Kam <orika at mellanox.com>
> > > Cc: Jerin Jacob <jerinj at marvell.com>; xiang.w.wang at intel.com; dpdk-dev
> > > <dev at dpdk.org>; Pavan Nikhilesh <pbhagavatula at marvell.com>; Shahaf
> > > Shuler <shahafs at mellanox.com>; Hemant Agrawal
> > > <hemant.agrawal at nxp.com>; Opher Reviv <opher at mellanox.com>; Alex
> > > Rosenbaum <alexr at mellanox.com>; dovrat at marvell.com; Prasun Kapoor
> > > <pkapoor at marvell.com>; Nipun Gupta <nipun.gupta at nxp.com>;
> Richardson,
> > > Bruce <bruce.richardson at intel.com>; yang.a.hong at intel.com;
> > > harry.chang at intel.com; gu.jian1 at zte.com.cn;
> shanjiangh at chinatelecom.cn;
> > > zhangy.yun at chinatelecom.cn; lixingfu at huachentel.com;
> wushuai at inspur.com;
> > > yuyingxia at yxlink.com; fanchenggang at sunyainfo.com;
> > > davidfgao at tencent.com; liuzhong1 at chinaunicom.cn;
> > > zhaoyong11 at huawei.com; oc at yunify.com; jim at netgate.com;
> > > hongjun.ni at intel.com; j.bromhead at titan-ic.com; deri at ntop.org;
> > > fc at napatech.com; arthur.su at lionic.com; Thomas Monjalon
> > > <thomas at monjalon.net>
> > > Subject: Re: [dpdk-dev] [PATCH v3] regexdev: introduce regexdev subsystem
> > >
> > > > diff --git a/lib/librte_regexdev/rte_regexdev.h
> > > b/lib/librte_regexdev/rte_regexdev.h
> > > > new file mode 100644
> > > > index 0000000..c42128b
> > > > --- /dev/null
> > > > +++ b/lib/librte_regexdev/rte_regexdev.h
> > > > @@ -0,0 +1,1411 @@
> > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > + * Copyright(C) 2019 Marvell International Ltd.
> > > > + * Copyright(C) 2020 Mellanox International Ltd.
> > >
> > > There are a few comments from Xiang as well. So let's add Intel also
> > > to the list.
> > >
> >
> > Sure no problem.
> 
> Thanks
> 
> >
> > > > + */
> > > > +
> > > > +#ifndef _RTE_REGEXDEV_H_
> > > > +#define _RTE_REGEXDEV_H_
> > >
> > > > +
> > > > +/**
> > > > + * RegEx device information
> > > > + */
> > > > +struct rte_regex_dev_info {
> > > > +       const char *driver_name; /**< RegEx driver name. */
> > > > +       struct rte_device *dev; /**< Device information. */
> > > > +       uint16_t max_matches;
> > > > +       /**< Maximum matches per scan supported by this device. */
> > > > +       uint16_t max_queue_pairs;
> > > > +       /**< Maximum queue pairs supported by this device. */
> > > > +       uint16_t max_payload_size;
> > > > +       /**< Maximum payload size for a pattern match request or scan.
> > > > +        * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> > > > +        */
> > > > +       uint32_t max_rules_per_group;
> > > > +       /**< Maximum rules supported per group by this device.
> > > > +        * This number can't be larger then 20 bits.
> > >
> > > s/then/than
> > >
> > > I think, we don't need to say this " This number can't be larger than 20 bits."
> > > It may help SW drivers.
> > >
> >
> > Agree I will remove the 20 bits part.
> >
> > >
> > >
> > > > +        */
> > > > +       uint16_t max_groups;
> > > > +       /**< Maximum group supported by this device.
> > > > +        * This number can't be larger then 12 bits.
> > > s/then/than
> > > I think, we don't need to say this " This number can't be larger than 12 bits."
> > > It may help SW drivers.
> > >
> >
> > Agree will remove the 12 bits part.
> >
> > > > +        */
> > > > +       uint32_t regex_dev_capa;
> > > > +       /**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> > > > +       uint64_t rule_flags;
> > > > +       /**< Supported compiler rule flags.
> > > > +        * @see RTE_REGEX_PCRE_RULE_*, struct
> rte_regex_rule::rule_flags
> > > > +        */
> > > > +       uint8_t max_scatter_gather;
> > > > +       /**< The max supported number of buffers that can
> > > > +        * be used in a single ops. The total size of all elements
> > > > +        * must be less then max_payload_size.
> > > > +        */
> > > > +};
> > > <snip>
> > >
> > > > +int
> > > > +rte_regex_rule_db_compile(uint8_t dev_id);
> > > > +
> > >
> > > I think your "rte_regex_rule_db_compile_activate() - compile and
> > > activate the new rule set"
> > > API name looks good. I am for rte_regex_rule_db_compile_activate().
> > >
> >
> > I like your name, will change to compile_activate.
> 
> Ack.
> 
> >
> > >
> > > > +/* Fast path APIs */
> > > > +
> > > > +/**
> > > > + * The generic *rte_regex_match* structure to hold the RegEx match
> > > attributes.
> > > > + * @see struct rte_regex_ops::matches
> > > > + */
> > > > +struct rte_regex_match {
> > > > +       RTE_STD_C11
> > > > +       union {
> > > > +               uint64_t u64;
> > > > +               struct {
> > > > +                       uint32_t rule_id:20;
> > > > +                       /**< Rule identifier to which the pattern matched.
> > > > +                        * @see struct rte_regex_rule::rule_id
> > > > +                        */
> > > > +                       uint32_t group_id:12;
> > > > +                       /**< Group identifier of the rule which the pattern
> > > > +                        * matched. @see struct rte_regex_rule::group_id
> > > > +                        */
> > > > +                       uint16_t offset;
> > >
> > > Since we have end_offset now, IMO, it is better to change this offset
> > > to "start_offset".
> > >
> >
> > Agree, will change.
> >
> > >
> > > > +                       /**< Starting Byte Position for matched rule. */
> > > > +                       RTE_STD_C11
> > > > +                       union {
> > > > +                               uint16_t len;
> > > > +                               /**< Length of match in bytes */
> > > > +                               uint16_t end_offset;
> > > > +                               /**< The end offset of the match. In case
> > > > +                                * MATCH_AS_START configuration is disabled.
> > > > +                                * @see RTE_REGEX_DEV_CFG_MATCH_AS_START
> > > > +                                */
> > >
> > > We have not concluded on this scheme. Have one field which has
> > > different meaning will be difficult
> > > for application. i.e fast path we need to have a check for this.
> > >
> >
> > This is the time to conclude . at least for the first version.
> > Why do we have one field with different meaning?
> > The result can be ether len or end_offset.
> >
> > > I think, Based on the majority of HW/SW implementation, we need to
> > > either go with len or
> > > end_offset. What Mellanox HW returns? len or end_offset?
> > >
> >
> > From Mellanox perspective we prefer the len approach. We also think
> > it is much more user oriented.
> >
> > > or We can keep it as len or end_offset based on which drivers upstream
> first,
> > > other drivers when it comes, we can see how to abstract it?
> > >
> >
> > I can except that assuming we choose the start and len approach
> 
> I think, we can have first version with "start and len" by removing
> RTE_REGEX_DEV_CFG_MATCH_AS_START.
> When can think, how to abstract new drivers when it upstream based on
> the overhead.
> 

Perfect

> 
> >
> > > > +                       };
> > > > +               };
> > > > +       };
> > > > +};
> > >
> > > > +/**
> > > > + * The generic *rte_regex_ops* structure to hold the RegEx attributes
> > > > + * for enqueue and dequeue operation.
> > > > + */
> > > > +struct rte_regex_ops {
> > > > +       /* W0 */
> > > > +       uint16_t req_flags;
> > > > +       /**< Request flags for the RegEx ops.
> > > > +        * @see RTE_REGEX_OPS_REQ_*
> > > > +        */
> > > > +       uint16_t rsp_flags;
> > > > +       /**< Response flags for the RegEx ops.
> > > > +        * @see RTE_REGEX_OPS_RSP_*
> > > > +        */
> > > > +       uint16_t nb_actual_matches;
> > > > +       /**< The total number of actual matches detected by the Regex
> > > device.*/
> > > > +       uint16_t nb_matches;
> > > > +       /**< The total number of matches returned by the RegEx device for
> this
> > > > +        * scan. The size of *rte_regex_ops::matches* zero length array will
> be
> > > > +        * this value.
> > > > +        *
> > > > +        * @see struct rte_regex_ops::matches, struct rte_regex_match
> > > > +        */
> > > > +
> > > > +       /* W1 */
> > > > +       uint16_t num_of_bufs;
> > > > +       /**< The number of bufs that are part of this ops. The total size of
> > > > +        * the length of all the buffer must be smaller then the max buffer
> > > > +        * len.
> > > > +        */
> > > > +       uint16_t resv1;
> > > > +       uint32_t resv2;
> > >
> > > One of the point came up in our implementation is that.
> > > HW can return an error due to various reasons.
> > >
> > > One option could be to make nb_matches as zero? and update some flag?
> > >
> > > What are your thoughts? updating the flag may be overkill.
> > >
> >
> > I think we can return just zero matches for now.
> 
> Ack.
> 
> >
> > >
> > > > +
> > > > +       /* W2 */
> > > > +       struct rte_regex_iov *(*bufs)[];
> > > > +       /**< Holds a pointer to the buffers list.*/
> > >
> > > This memory gets submitted to HW so it can not be from the heap.
> > > Cryptodev had a similar dilemma to use the container format for
> > > multi-segment case, Finally they choose to with mbuf.
> > >
> > > The following elements are in mbuf. Considering to avoid duplication and
> > > avoid overhead most common usecase DPI(Assume if it is rte_regex_iov,
> > > one need to copy all the elements from mbuf on fastpath).
> > > I propose to have mbuf here instead of rte_regex_iov.
> > >
> >
> > The application only needs to set the data pointers. (no copy is required. )
> > I agree that there are advantages to the mbuf approach.
> > The main limitation for the mbufs approach is that the user will need to play
> with the offset
> > pointers and pointers to the next mbuf, in order to support cross buffer.
> > For example we have a packet and we want to add to the scan also the last
> part of the previous packet,
> > this means that the application must modify the data offset in the previous
> packet mbuf including
> > changing the next pointer to point to the head of the new packet, and then
> return the values to the original position.
> >
> > What do you think?
> > We can start with mbufs and see how it works, or start with the buffer and
> see how it works.
> 
> I think, we can start with mbuf to align with other subsystems. We
> will see later the use case for struct rte_regex_iov.
>

Agree.
 
> 
> >
> >
> >
> >
> > > struct rte_regex_iov {
> > > RTE_STD_C11
> > > union {
> > > uint64_t u64;
> > > /**<  Allow 8-byte reserved on 32-bit system */
> > > void *buf_addr;
> > > /**< Virtual address of the pattern to be matched. */
> > > };
> > > rte_iova_t buf_iova;
> > > /**< IOVA address of the pattern to be matched. */
> > > uint16_t buf_size; /**< The buf size. */
> > > };
> > >
> > >
> > > > +
> > > > +       /* W5 */
> > > > +       RTE_STD_C11
> > > > +       union {
> > > > +               uint64_t cross_buf_id;
> > > > +               /**< ID used by the RegEx device in order to handle cross
> > > > +                * buffer detection.
> > > > +                * This ID is given by the RegEx device on dequeue, and
> > > > +                * the application must send it on the following enque.
> > > > +                */
> > > > +               void *cross_buf_ptr;
> > > > +               /**< Pointer representation of *cross_buf_id* */
> > >
> > > Could you have some example of how to use cross_buf_id?
> > > Marvell HW does not support cross_buf_id, so we need to add this
> > > feature as capability.
> > >
> >
> > The idea is that this buffer will be used to keep some internal data for the
> engine.
> > For example the current state and what was found until now, and then reuse
> this
> > for the next buffer.
> > We can remove it for now if we agree that we can add it later.
> 
> I think, adding it later would be better so that we can see how to
> abstract it well.
> 

Agree.

> >
> >
> >
> > One more thing, regarding the ops structure, I think it is better to split it to 2
> different
> > structures one enque and one for dequeue, since there are no real shared
> data and we will
> > be able to save memory, what do you think?
> 
> Ops are allocated from mempool so it will be overhead to manage both.
> moreover, some
> of the fields added in req can be used for resp as info. cryptodev
> follows the similar concept,
> I think, we can have symmetry with cryptodev wherever is possible to avoid
> end-user to learn new API models.

True that there will be overhead with 2 mempools (small one)
but lets assume 255 results. This means that the buffer should be 255 * sizeof(rte_regex_match) = 2K
also this will enable us to replace groupX with group[] which will allow even more groups.
In addition don't think that crypto is a good example.
The main difference is that in RegEx the output is different format then the input.


> 
> 
> 
> >
> > >
> > >
> > > Regarding the rule attributes, We think, following needs to be added to
> control
> > > the rule compilation behavior. If it not converging in first look, I
> > > think, we can make
> > > below as separate patch once we have basic things.
> > >
> >
> > Regarding the new code, we need also to add a function to get the
> capabilities for the compiler or
> > add a new field in the dev_info which will report the complier supported
> features.
> 
> I agree. Lets remove this new code from the first version. We can add
> it later with capability as
> a new patch.
> 

Agree

> I assume you will send the v4 with these comments. I think, with v4 we
> can start implementing common library code.

Just need to agree on the split (one more iteration 😊)
and I will start working on the common code.



More information about the dev mailing list