[dpdk-dev] [RFC v6] regexdev: introduce regexdev subsystem

Wang Xiang xiang.w.wang at intel.com
Mon Mar 16 22:10:40 CET 2020


Hi Ori,

Yes, please go ahead with the patch.

Thanks,
Xiang
On Mon, Mar 16, 2020 at 01:49:51PM +0000, Ori Kam wrote:
> Hi Wang,
> 
> Please see below. If you have no objections or further comments,
> I will start working on the class and address all of this thread's comments
> in the v1 patch.
> 
> Thanks,
> Ori 
> 
> > -----Original Message-----
> > From: Wang Xiang <xiang.w.wang at intel.com>
> > Sent: Monday, March 16, 2020 10:48 PM
> > To: Ori Kam <orika at mellanox.com>
> > Cc: jerinj at marvell.com; dev at dpdk.org; pbhagavatula at marvell.com; Shahaf
> > Shuler <shahafs at mellanox.com>; hemant.agrawal at nxp.com; Opher Reviv
> > <opher at mellanox.com>; Alex Rosenbaum <alexr at mellanox.com>;
> > dovrat at marvell.com; pkapoor at marvell.com; nipun.gupta at nxp.com;
> > bruce.richardson at intel.com; yang.a.hong at intel.com; harry.chang at intel.com;
> > gu.jian1 at zte.com.cn; shanjiangh at chinatelecom.cn;
> > zhangy.yun at chinatelecom.cn; lixingfu at huachentel.com; wushuai at inspur.com;
> > yuyingxia at yxlink.com; fanchenggang at sunyainfo.com;
> > davidfgao at tencent.com; liuzhong1 at chinaunicom.cn;
> > zhaoyong11 at huawei.com; oc at yunify.com; jim at netgate.com;
> > hongjun.ni at intel.com; j.bromhead at titan-ic.com; deri at ntop.org;
> > fc at napatech.com; arthur.su at lionic.com; Thomas Monjalon
> > <thomas at monjalon.net>
> > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > 
> > Hi Ori,
> > 
> > On Mon, Mar 16, 2020 at 09:09:06AM +0000, Ori Kam wrote:
> > > Hi Xiang,
> > >
> > > > -----Original Message-----
> > > > From: Wang Xiang <xiang.w.wang at intel.com>
> > > > Sent: Monday, March 16, 2020 3:26 AM
> > > > To: Ori Kam <orika at mellanox.com>
> > > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > > >
> > > > On Sun, Mar 15, 2020 at 10:05:53AM +0000, Ori Kam wrote:
> > > > Hi Ori,
> > > >
> > > > > Hi Xiang,
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Wang Xiang <xiang.w.wang at intel.com>
> > > > > > Sent: Friday, March 13, 2020 3:20 AM
> > > > > > To: Ori Kam <orika at mellanox.com>
> > > > > > Subject: Re: [RFC v6] regexdev: introduce regexdev subsystem
> > > > > >
> > > > > > Hi Ori,
> > > > > >
> > > > > > Sorry for the late response; I have been occupied with other work.
> > > > > > Two comments below to make the definitions compatible with Hyperscan.
> > > > > >
> > > > > > Thanks,
> > > > > > Xiang
> > > > > >
> > > > > > On Tue, Mar 10, 2020 at 10:32:33AM +0000, Ori Kam wrote:
> > > > > > > +#define RTE_REGEX_PCRE_RULE_MATCH_ALL_F (1ULL << 13)
> > > > > > > > +/**< This flag marks that the results for the pattern that is being compiled
> > > > > > > > + * should include all possible matches.
> > > > > > > > + * @see struct rte_regex_dev_info::rule_flags, struct rte_regex_rule::rule_flags
> > > > > > > + */
> > > > > > > +
> > > > > > > Can we change this flag to RTE_REGEX_DEV_CFG_MATCH_ALL since Hyperscan
> > > > > > > only supports match all mode and users don't have to specify this flag per rule?
> > > > > >
> > > > >
> > > > > Sure, we can replace RTE_REGEX_PCRE_RULE_MATCH_ALL_F with
> > > > > RTE_REGEX_DEV_CFG_MATCH_ALL and add RTE_REGEX_DEV_CAPA_SUPP_MATCH_ALL.
> > > > >
> > > > Ack, thanks.
> > > > >
> > > > > > > + */
> > > > > > > +__rte_experimental
> > > > > > > +int
> > > > > > > > +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> > > > > > > +
> > > > > > > +/* Enumerates RegEx device configuration flags */
> > > > > > > +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> > > > > > > +/**< Cross buffer scan refers to the ability to be able to detect
> > > > > > > > + * matches that occur across buffer boundaries, where the buffers are related
> > > > > > > > + * to each other in some way. Enable this flag when to scan payload size
> > > > > > > > + * greater than struct rte_regex_dev_info::max_payload_size and/or
> > > > > > > > + * matches can present across scan buffer boundaries.
> > > > > > > > + *
> > > > > > > > + * @see struct rte_regex_dev_info::max_payload_size
> > > > > > > > + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> > > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> > > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> > > > > > > + * @see RTE_REGEX_OPS_RSP_PMI_TOJ_F
> > > > > > > + */
> > > > > > > +
> > > > > > > Can we add another flag, RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F?
> > > > > > > In this case, we only return full matches for cross buffer scan, without any
> > > > > > > partial results and without response flags such as RTE_REGEX_OPS_RSP_PMI_*.
> > > > >
> > > > > I think it is good in any case to return a flag if the detection was based on
> > > > > more than one buffer, so I don't really see the advantage of adding such a flag.
> > > > > As far as I understand, in your case, if the match started in the previous buffer
> > > > > and ended in the current buffer, then you will also return the
> > > > > RTE_REGEX_OPS_RSP_PMI_TOJ_F flag.
> > > > > For my general knowledge, suppose your system has the regex ABC,
> > > > > the first buffer is xxxA (size 4), and the second buffer is BCxx.
> > > > > If I understand correctly, for the first buffer you will return no match found.
> > > > > For the second buffer you will return a match with end offset equal to 2.
> > > > > Am I correct?
> > > > > Or are you going to return end offset 6 because the match started in the
> > > > > previous buffer?
> > > > >
> > > > Hyperscan guarantees the same matching result regardless of whether the data is
> > > > in a single block or scattered across multiple blocks. So we'll return end offset 6
> > > > in this case, without any flag indicating whether the match started in the
> > > > previous buffer or the current buffer.
> > >
> > > What will happen if the match is only in the second buffer? For example,
> > > as before the regex is ABC, but now the first buffer is xxxx and the second
> > > buffer is ABCx. Will the result be end offset 3 or 7?
> > > If the answer is 3, then I think the flag is important, in order to let the user
> > > know that they should count from the previous buffer.
> > > If the answer is 7, then since only Hyperscan works with end offset, it could be
> > > defined that when working with end offset and cross buffer scan is supported,
> > > the result is always the true result.
> > >
> > > So I think that RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_FULL_F is not relevant in
> > > any case, but the flag should be used if the offset returned is 3.
> > >
> > Hyperscan returns 7 in this case, so these flags aren't necessary.
> > 
> > Hyperscan works in two modes:
> > 1) return start and end offset
> > 2) return end offset
> > 
> > Since only Hyperscan supports RTE_REGEX_DEV_CFG_MATCH_ALL, we can
> > define the result as always true if match all and cross buffer scan are
> > configured. Having the scan full flag will make users more aware of
> > the difference from HW solutions. If you really don't want to keep this flag,
> > please make this definition clear to users.
> 
> The issue with the new flag is that it should always be set, so it is redundant
> if I understand correctly. I will try to make it clearer in the comment.
> 
> > >
> > > A related question: how does Hyperscan mark that 2 buffers should be
> > > treated as one?
> > > I think you are missing the cross_buf_id that was introduced in V3 but was
> > > removed due to lack of usage. This variable was designed to give the RegEx
> > > engine a place to save the engine state.
> > >
> > I agree, we need to have the cross_buf_id back to support cross buffer
> > scan.
> 
> I will re-add it.
> 
> > > > >
> > > > > Best,
> > > > > Ori
> > > > >
> > > >
> > > > Best,
> > > > Xiang
> > 
> > Thanks,
> > Xiang

