[dpdk-dev] [RFC v5] regexdev: introduce regexdev subsystem

Wang Xiang xiang.w.wang at intel.com
Mon Mar 2 08:05:00 CET 2020


Hi Ori,

Comments below.

Thanks,
Xiang

On Thu, Feb 27, 2020 at 03:08:35PM +0000, Ori Kam wrote:
> From: Jerin Jacob <jerinj at marvell.com>
> 
> Even though there are some vendors which offer Regex HW offload, due to
> lack of a standard API, it is difficult for DPDK consumers to use them
> in a portable way.
> 
> This _RFC_ attempts to standardize the RegEx/DPI offload APIs for DPDK.
> 
> This RFC is crafted based on SW Regex API frameworks such as libpcre and
> hyperscan and a few of the RegEx HW IPs which I am aware of.
> 
> RegEx pattern matching applications:
> * Next Generation Firewalls (NGFW)
> * Deep Packet and Flow Inspection (DPI)
> * Intrusion Prevention Systems (IPS)
> * DDoS Mitigation
> * Network Monitoring
> * Data Loss Prevention (DLP)
> * Smart NICs
> * Grammar based content processing
> * URL, spam and adware filtering
> * Advanced auditing and policing of user/application security policies
> * Financial data mining - parsing of streamed financial feeds
> * Application recognition.
> * Memory introspection.
> * Natural Language Processing (NLP)
> * Sentiment Analysis.
> * Big data database acceleration.
> * Computational storage.
> 
> Request to review from HW and SW RegEx vendors and RegEx application
> users to have portable DPDK API for RegEx.
> 
> The API schematics are based on the existing cryptodev, eventdev and ethdev
> device APIs.
> 
> Signed-off-by: Jerin Jacob <jerinj at marvell.com>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula at marvell.com>
> Signed-off-by: Ori Kam <orika at mellanox.com>
> ---
> V5:
>  * Remove unused iov struct.
> V4:
>  * Replace iov with mbuf.
>  * Small ML comments.
> V3:
>  * Change subject title.
> V2:
>  * Address ML comments.
> ---
> +
> +#define RTE_REGEX_DEV_SUPP_PCRE_GREEDY_F (1ULL << 6)
> +/**< RegEx device support PCRE Greedy mode.
> + * For example if the RegEx is 'AB\d*?' then '*?' represents zero or unlimited
> + * matches. In greedy mode the pattern 'AB12345' will be matched completely
> + * where as the ungreedy mode 'AB' will be returned as the match.
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +

Hyperscan actually supports a "match all" semantic, which is neither greedy nor
ungreedy and so differs from PCRE. In the case above, AB, AB1, ..., AB12345 will
all be returned as matches. Do HW solutions support this?
Can we add a new flag like RTE_REGEX_DEV_SUPP_PCRE_MATCHALL_F?
Similarly, we can define a flag RTE_REGEX_PCRE_RULE_MATCHALL_F that Hyperscan
users have to set during rule compilation.
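
To make the difference concrete, here is a minimal sketch in plain C. It is
not the Hyperscan API and not part of the proposed regexdev API: the helper
match_all_ab_digits() is hypothetical and hard-codes the rule 'AB\d*' anchored
at the start of the buffer. It records every end offset at which the rule
matches, which is what "match all" means here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only.
 * Rule: 'AB' followed by zero or more digits, anchored at buf[0].
 * Stores the exclusive end offset of every match in ends[] and returns
 * how many offsets were stored (at most max_ends).
 */
static size_t
match_all_ab_digits(const char *buf, size_t len, size_t *ends, size_t max_ends)
{
	size_t n = 0;
	size_t pos = 2;

	assert(buf != NULL && ends != NULL);

	if (len < 2 || buf[0] != 'A' || buf[1] != 'B')
		return 0;

	/* 'AB' with zero digits is already a match. */
	if (n < max_ends)
		ends[n++] = pos;

	/* Every additional digit yields one more, longer match. */
	while (pos < len && n < max_ends &&
	       isdigit((unsigned char)buf[pos])) {
		pos++;
		ends[n++] = pos;
	}
	return n;
}
```

For the input 'AB12345' this stores the end offsets 2, 3, 4, 5, 6 and 7. An
ungreedy engine would report only the first of these (AB), a greedy engine
only the last (AB12345), and a "match all" engine reports all six.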

> +#define RTE_REGEX_DEV_SUPP_PCRE_LOOKAROUND_ASRT_F (1ULL << 7)
> +/**< RegEx device support PCRE Lookaround assertions
> + * (Zero-width assertions). Example RegEx is '[a-z]+\d+(?=!{3,})' if
> + * the given pattern is 'dwad1234!' the RegEx engine doesn't report any matches
> + * because the assert '(?=!{3,})' fails. The pattern 'dwad123!!!' would return a
> + * successful match.
> + * @see struct rte_regex_dev_info::regex_dev_capa
> + */
> +
> +
> +/**
> + * RegEx device information
> + */
> +struct rte_regex_dev_info {
> +	const char *driver_name; /**< RegEx driver name. */
> +	struct rte_device *dev;	/**< Device information. */
> +	uint16_t max_matches;
> +	/**< Maximum matches per scan supported by this device. */
> +	uint16_t max_queue_pairs;
> +	/**< Maximum queue pairs supported by this device. */
> +	uint16_t max_payload_size;
> +	/**< Maximum payload size for a pattern match request or scan.
> +	 * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> +	 */
> +	uint32_t max_rules_per_group;
> +	/**< Maximum rules supported per group by this device. */
> +	uint16_t max_groups;
> +	/**< Maximum groups supported by this device. */
> +	uint32_t regex_dev_capa;
> +	/**< RegEx device capabilities. @see RTE_REGEX_DEV_CAPA_* */
> +	uint64_t rule_flags;
> +	/**< Supported compiler rule flags.
> +	 * @see RTE_REGEX_PCRE_RULE_*, struct rte_regex_rule::rule_flags
> +	 */
> +	uint8_t max_scatter_gather;
> +	/**< The max supported number of buffers that can
> +	 * be used in a single ops. The total size of all elements
> +	 * must be less then max_payload_size.
> +	 */

s/then/than

> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Retrieve the contextual information of a RegEx device.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + *
> + * @param[out] dev_info
> + *   A pointer to a structure of type *rte_regex_dev_info* to be filled with the
> + *   contextual information of the device.
> + *
> + * @return
> + *   - 0: Success, driver updates the contextual information of the RegEx device
> + *   - <0: Error code returned by the driver info get function.
> + *
> + */
> +__rte_experimental
> +int
> +rte_regex_dev_info_get(uint8_t dev_id, struct rte_regex_dev_info *dev_info);
> +
> +/* Enumerates RegEx device configuration flags */
> +#define RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F (1ULL << 0)
> +/**< Cross buffer scan refers to the ability to be able to detect
> + * matches that occur across buffer boundaries, where the buffers are related
> + * to each other in some way. Enable this flag when to scan payload size
> + * greater struct struct rte_regex_dev_info::max_payload_size and/or
> + * matches can present across scan buffer boundaries.

s/struct/than

> + *
> + * @see struct rte_regex_dev_info::max_payload_size
> + * @see struct rte_regex_dev_config::dev_cfg_flags, rte_regex_dev_configure()
> + * @see RTE_REGEX_OPS_RSP_PMI_SOJ_F
> + * @see RTE_REGEX_OPS_RSP_PMI_EOJ_F
> + */
> +
> +
> +/* Enumerates RegEx response flags. */
> +#define RTE_REGEX_OPS_RSP_PMI_SOJ_F (1 << 0)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * start of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +

Hyperscan supports cross buffer scan and only reports true matches instead of
partial matches. Can we let users configure this partial match capability?
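
As a rough illustration of "true matches only" across buffers, here is a
minimal sketch in plain C. It is neither the Hyperscan streaming API nor the
proposed regexdev API: struct stream_ctx and stream_scan() are hypothetical,
and the rule is hard-coded to the literal 'ABC'. The matcher keeps its state
between buffers, so a match that straddles a buffer boundary is reported once,
when it completes, and no partial match is ever reported to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-stream state, carried from one buffer to the next. */
struct stream_ctx {
	size_t matched; /* bytes of "ABC" matched so far */
	size_t offset;  /* total bytes consumed from the stream */
};

/*
 * Scan one buffer of a stream for the literal "ABC".
 * Returns the number of complete matches that end in this buffer and
 * stores the stream offset (exclusive) of the last one in *last_end.
 * *last_end is left untouched when there is no complete match.
 */
static size_t
stream_scan(struct stream_ctx *ctx, const char *buf, size_t len,
	    size_t *last_end)
{
	static const char pat[] = "ABC";
	const size_t pat_len = sizeof(pat) - 1;
	size_t hits = 0;
	size_t i;

	assert(ctx != NULL && buf != NULL && last_end != NULL);

	for (i = 0; i < len; i++) {
		if (buf[i] == pat[ctx->matched])
			ctx->matched++;
		else
			/* Restart; the byte may itself begin a new match. */
			ctx->matched = (buf[i] == pat[0]) ? 1 : 0;
		ctx->offset++;

		if (ctx->matched == pat_len) {
			hits++;
			*last_end = ctx->offset;
			ctx->matched = 0;
		}
	}
	return hits;
}
```

Scanning 'xxA' and then 'BCy' with the same context returns 0 for the first
buffer and 1 for the second, with the match ending at stream offset 5. After
the first buffer ctx.matched is 1; that internal state is roughly the
information a device would otherwise expose through a partial match flag at
the end of the scan.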

> +#define RTE_REGEX_OPS_RSP_PMI_EOJ_F (1 << 1)
> +/**< Indicates that the RegEx device has encountered a partial match at the
> + * end of scan in the given buffer.
> + *
> + * @see RTE_REGEX_DEV_CFG_CROSS_BUFFER_SCAN_F
> + */
> +

