[dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem

Jerin Jacob Kollanukkaran jerinj at marvell.com
Fri Sep 27 16:45:42 CEST 2019


> -----Original Message-----
> From: Jerin Jacob Kollanukkaran
> Sent: Tuesday, September 10, 2019 4:33 PM
> To: Shahaf Shuler <shahafs at mellanox.com>; Thomas Monjalon
> <thomas at monjalon.net>; dev at dpdk.org
> Cc: Pavan Nikhilesh Bhagavatula <pbhagavatula at marvell.com>; Hemant
> Agrawal <hemant.agrawal at nxp.com>; Opher Reviv <opher at mellanox.com>;
> Alex Rosenbaum <alexr at mellanox.com>; Dovrat Zifroni
> <dovrat at marvell.com>; Prasun Kapoor <pkapoor at marvell.com>; Nipun Gupta
> <nipun.gupta at nxp.com>; Wang, Xiang W <xiang.w.wang at intel.com>;
> Richardson, Bruce <bruce.richardson at intel.com>; yang.a.hong at intel.com;
> harry.chang at intel.com; gu.jian1 at zte.com.cn; shanjiangh at chinatelecom.cn;
> zhangy.yun at chinatelecom.cn; lixingfu at huachentel.com;
> wushuai at inspur.com; yuyingxia at yxlink.com; fanchenggang at sunyainfo.com;
> davidfgao at tencent.com; liuzhong1 at chinaunicom.cn;
> zhaoyong11 at huawei.com; oc at yunify.com; jim at netgate.com;
> hongjun.ni at intel.com; j.bromhead at titan-ic.com; deri at ntop.org;
> fc at napatech.com; arthur.su at lionic.com
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > Hi Jerin,
> 
> Hi Shahaf,
> 
> Sorry for the delay in responding (I was busy with the 19.11 proposal deadline).
> Please see inline.
> 
> > >
> > > RegEx pattern matching applications:
> > > • Next Generation Firewalls (NGFW)
> > > • Deep Packet and Flow Inspection (DPI)
> > > • Intrusion Prevention Systems (IPS)
> > > • DDoS Mitigation
> > > • Network Monitoring
> > > • Data Loss Prevention (DLP)
> > > • Smart NICs
> > > • Grammar based content processing
> > > • URL, spam and adware filtering
> > > • Advanced auditing and policing of user/application security policies
> > > • Financial data mining - parsing of streamed financial feeds
> >
> > I think two more important use cases to add (at least in the doc of
> > this subsystem) are:
> > * application recognition
> > * memory introspection
> 
> Sure. Will add the following from John as well.
> 
> # Natural Language Processing (NLP)
> # Sentiment Analysis
> # Big Data database acceleration (Spark, Hadoop etc.)
> # Computational Storage
> 
> >
> >
> > > +/**
> > > + * Update the rule database of a RegEx device.
> > > + *
> > > + * @param dev_id RegEx device identifier
> > > + * @param rules
> > > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule*
> > > + *   structure which contain the regex rules attributes to be updated in
> > > + *   rule database.
> > > + * @param nb_rules
> > > + *   The number of PCRE rules to update the rule database.
> > > + *
> > > + * @return
> > > + *   The number of regex rules actually updated on the regex device's rule
> > > + *   database. The return value can be less than the value of the *nb_rules*
> > > + *   parameter when the regex devices fails to update the rule database or
> > > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > > + *   at the end of *rules* are not consumed and the caller has to take
> > > + *   care of them and rte_errno is set accordingly.
> > > + *   Possible errno values include:
> > > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > > + *   - -ENOSPC: No space available in rule database.
> > > + *
> > > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > > + */
> > > +uint16_t rte_regex_rule_db_update(uint8_t dev_id,
> > > +			 const struct rte_regex_rule *rules,
> > > +			 uint16_t nb_rules);
> >
> > I think the function name is not too informative. If this function is
> > meant to compile the rules then it should be explicit in the function name.
> 
> It is meant to compile the rules and then update the rule database.
> 
> I think we can have either 1 or 2. Let me know your preference, or if you have
> any name suggestion, I will change it accordingly.
> 
> 1. rte_regex_rule_db_compile()
> 2. rte_regex_rule_db_compile_update()
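
Whichever name is picked, and assuming the partial-update contract from the doc
comment quoted above stays as it is, a caller could handle a short return
roughly as sketched below. Everything here is a stand-in for illustration:
stub_rule_db_update(), stub_errno, stub_db_room, load_rules() and the members
of struct rte_regex_rule are hypothetical, and the sign convention of the error
code is an assumption, since none of this is settled in the RFC yet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule layout; the RFC has not fixed the members. */
struct rte_regex_rule {
	const char *pcre_rule;
	uint16_t rule_id;
};

static int stub_errno;            /* stands in for rte_errno */
static uint16_t stub_db_room = 3; /* pretend the device has room for 3 rules */

/* Stub following the documented contract: returns the number of rules
 * consumed and sets the error code when fewer than nb_rules were taken.
 */
static uint16_t
stub_rule_db_update(uint8_t dev_id, const struct rte_regex_rule *rules,
		    uint16_t nb_rules)
{
	uint16_t n;

	(void)dev_id;
	if (rules == NULL) {
		stub_errno = EINVAL;
		return 0;
	}
	n = nb_rules < stub_db_room ? nb_rules : stub_db_room;
	stub_db_room -= n;
	if (n < nb_rules)
		stub_errno = ENOSPC;
	return n;
}

/* Caller side: report how many rules were consumed and why the call
 * stopped early; rules[done..nb_rules-1] remain owned by the caller.
 */
static uint16_t
load_rules(uint8_t dev_id, const struct rte_regex_rule *rules,
	   uint16_t nb_rules, int *err)
{
	uint16_t done;

	assert(err != NULL);
	done = stub_rule_db_update(dev_id, rules, nb_rules);
	*err = (done < nb_rules) ? stub_errno : 0;
	return done;
}
```

With five rules and room for three, load_rules() returns 3 and reports ENOSPC,
so the application knows the last two rules were not consumed.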


@Shahaf Shuler, Thoughts?


> 
> 
> > > +
> > > + */
> > > +struct rte_regex_ops {
> > > +
> > > +	/* W4 */
> > > +	RTE_STD_C11
> > > +	union {
> > > +		uint64_t user_id;
> > > +		/**< Application specific opaque value. An application may use
> > > +		 * this field to hold application specific value to share
> > > +		 * between dequeue and enqueue operation.
> > > +		 * Implementation should not modify this field.
> > > +		 */
> > > +		void *user_ptr;
> > > +		/**< Pointer representation of *user_id* */
> > > +	};
> >
> > Since we target the regex subsystem for both regex and DPI I think it
> > will be good to add another uint64_t field called connection_id.
> > Devices that support DPI can refer to it as another matchable field
> > when looking up matches on the given buffer.
> >
> > This field is different from the user_id, as it is not opaque for the device.
> 
> Is this a driver-specific storage place that the application should not touch?
> 
> If not, could you share the data flow of this field? I.e., who "writes" this
> field and who "reads" it?

@Shahaf Shuler Thoughts?

Based on your input, I will update the next version.

> 
> This is just for documentation; in any event, we can add new fields.
> 
> If it is only for driver usage, then I think some drivers may need more than 8B
> of storage. In that case, each driver can add its own fields after W4 (i.e. the
> existing user_id), and we introduce a new field called match_offset in struct
> rte_regex_ops,
> 
> i.e. struct rte_regex_match *matches == ops + ops->match_offset; so that each
> driver can add enough driver-specific metadata.
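
To make the match_offset idea concrete, here is a minimal sketch of the layout.
It assumes match_offset is a byte offset from the start of the op, so the
arithmetic goes through a byte pointer rather than plain `ops + offset`. The
simplified struct members, struct drv_meta, ops_alloc() and ops_matches() are
hypothetical names used only for illustration, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified match record. */
struct rte_regex_match {
	uint16_t rule_id;
	uint16_t offset;
	uint16_t len;
};

/* Simplified op header: only the fields needed for the sketch. */
struct rte_regex_ops {
	uint64_t user_id;      /* W4: opaque to the driver */
	uint16_t match_offset; /* assumed: bytes from ops to the match array */
	uint16_t nb_matches;
};

/* Example of driver-private metadata placed after the op header. */
struct drv_meta {
	uint64_t connection_id;
	uint64_t scratch;
};

/* Locate the match array using the byte offset stored in the op. */
static struct rte_regex_match *
ops_matches(struct rte_regex_ops *ops)
{
	assert(ops != NULL);
	return (struct rte_regex_match *)((uint8_t *)ops + ops->match_offset);
}

/* Allocate header + driver metadata + match array in one zeroed buffer. */
static struct rte_regex_ops *
ops_alloc(size_t drv_meta_size, uint16_t max_matches)
{
	size_t align = _Alignof(struct rte_regex_match);
	size_t off = sizeof(struct rte_regex_ops) + drv_meta_size;
	struct rte_regex_ops *ops;

	/* Round up so the match array is suitably aligned. */
	off = (off + align - 1) & ~(align - 1);
	ops = calloc(1, off + max_matches * sizeof(struct rte_regex_match));
	if (ops == NULL)
		return NULL;
	ops->match_offset = (uint16_t)off;
	return ops;
}
```

A driver that needs more than 8B would pass its own metadata size to
ops_alloc(), and the application would only ever go through ops_matches(), so
it does not need to know how much space the driver reserved.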
> 
> 
> 


