[dpdk-dev] [PATCH v3 1/7] member: implement main API
De Lara Guarch, Pablo
pablo.de.lara.guarch at intel.com
Mon Sep 25 16:15:03 CEST 2017
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yipeng Wang
> Sent: Wednesday, September 6, 2017 1:00 AM
> To: dev at dpdk.org
> Cc: Tai, Charlie <charlie.tai at intel.com>; Gobriel, Sameh
> <sameh.gobriel at intel.com>; Wang, Ren <ren.wang at intel.com>; Wang,
> Yipeng1 <yipeng1.wang at intel.com>; Mcnamara, John
> <john.mcnamara at intel.com>
> Subject: [dpdk-dev] [PATCH v3 1/7] member: implement main API
>
> Membership library is an extension and generalization of a traditional filter
> (for example Bloom Filter) structure. In general, the Membership library is a
> data structure that provides a "set-summary" and responds to set-
> membership queries of whether a certain element belongs to a set(s). A
> membership test for an element will return the set this element belongs to
> or not-found if the element is never inserted into the set-summary.
>
> The results of the membership test is not 100% accurate. Certain false
Is -> are.
> positive or false negative probability could exist. However, comparing to a
> "full-blown" complete list of elements, a "set-summary"
> is memory efficient and fast on lookup.
>
> This patch add the main API definition.
>
> Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
> ---
> lib/Makefile | 2 +
> lib/librte_eal/common/eal_common_log.c | 1 +
> lib/librte_eal/common/include/rte_log.h | 1 +
> lib/librte_member/Makefile | 49 +++
> lib/librte_member/rte_member.c | 342 ++++++++++++++++++++
> lib/librte_member/rte_member.h | 518
> +++++++++++++++++++++++++++++++
> lib/librte_member/rte_member_version.map | 16 +
> 7 files changed, 929 insertions(+)
> create mode 100644 lib/librte_member/Makefile create mode 100644
> lib/librte_member/rte_member.c create mode 100644
> lib/librte_member/rte_member.h create mode 100644
> lib/librte_member/rte_member_version.map
>
...
> diff --git a/lib/librte_member/rte_member.c
> b/lib/librte_member/rte_member.c new file mode 100644 index
> 0000000..71b066d
> --- /dev/null
> +++ b/lib/librte_member/rte_member.c
...
> +
> +#include <string.h>
> +
> +#include <rte_eal.h>
> +#include <rte_eal_memconfig.h>
> +#include <rte_memory.h>
> +#include <rte_memzone.h>
> +#include <rte_malloc.h>
> +#include <rte_errno.h>
> +
> +#include "rte_member.h"
> +#include "rte_member_ht.h"
> +#include "rte_member_vbf.h"
> +
> +TAILQ_HEAD(rte_member_list, rte_tailq_entry); static struct
> +rte_tailq_elem rte_member_tailq = {
> + .name = "RTE_MEMBER",
> +};
> +EAL_REGISTER_TAILQ(rte_member_tailq)
> +
> +
> +void *
> +rte_member_find_existing(const char *name) {
> + struct rte_member_setsum *setsum = NULL;
> + struct rte_tailq_entry *te;
> + struct rte_member_list *member_list;
> +
> + member_list = RTE_TAILQ_CAST(rte_member_tailq.head,
> rte_member_list);
> +
> + rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
> + TAILQ_FOREACH(te, member_list, next) {
> + setsum = (struct rte_member_setsum *) te->data;
> + if (strncmp(name, setsum->name,
> RTE_MEMBER_NAMESIZE) == 0)
> + break;
> + }
> + rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
> +
> + if (te == NULL) {
> + rte_errno = ENOENT;
> + return NULL;
> + }
> + return setsum;
> +}
> +
> +void
> +rte_member_free(void *ss)
Why not using directly "struct rte_member_setsum", so you can use it directly?
This applies to other functions that are using "void *".
I see that the content of this structure is internal, but you can still use a pointer to that structure.
...
> +void *
> +rte_member_create(const struct rte_member_parameters *params) {
> + struct rte_tailq_entry *te;
> + struct rte_member_list *member_list = NULL;
> + struct rte_member_setsum *setsum = NULL;
> + int ret;
> +
> + if (params == NULL) {
> + rte_errno = EINVAL;
> + return NULL;
> + }
> +
> + if ((params->key_len == 0)) {
> + rte_errno = EINVAL;
> + RTE_LOG(ERR, MEMBER,
> + "Memship create with invalid parameters\n");
rte_member_create has invalid parameters?
> diff --git a/lib/librte_member/rte_member.h
> b/lib/librte_member/rte_member.h new file mode 100644 index
> 0000000..de44b1b
> --- /dev/null
> +++ b/lib/librte_member/rte_member.h
> @@ -0,0 +1,518 @@
...
> +enum rte_member_setsum_type {
> + RTE_MEMBER_TYPE_HT = 0, /**< Hash table based set summary.
> */
> + RTE_MEMBER_TYPE_VBF, /**< vector of bloom filters. */
"vector" -> "Vector".
> + RTE_MEMBER_NUM_TYPE
> +};
> +
> +/** @internal compare function for different arch. */ enum
> +rte_member_sig_compare_function {
> + RTE_MEMBER_COMPARE_SCALAR = 0,
> + RTE_MEMBER_COMPARE_AVX2,
> + RTE_MEMBER_COMPARE_NUM
> +};
> +
> +/** @internal setsummary structure. */
> +struct rte_member_setsum {
> + enum rte_member_setsum_type type;
> + const char *name;
> + uint32_t key_len;
> + uint32_t socket_id; /* NUMA Socket ID for memory. */
Either comment all the fields or do not do it (for this case, because the
structure is internal, otherwise, all fields would require a comment).
Also, remember that they should all start with capital letters.
> + uint32_t prim_hash_seed;
> + uint32_t sec_hash_seed;
> +
> +
> + /* hash table based */
> + uint32_t bucket_cnt;
> + uint32_t bucket_mask;
> + uint8_t cache;
> + enum rte_member_sig_compare_function sig_cmp_fn;
...
> +struct rte_member_parameters {
...
> + uint8_t iscache;
As said in the documentation patch, change to "is_cache".
> +
> + /**
> + * For HT setsummary, num_keys equals to the number of entries
> of the
> + * table. When the number of keys that inserted to the HT
> setsummary
"number of keys inserted in the HT summary".
> + * approaches this number, eviction could happen. For cache mode,
> + * keys could be evicted out of the table. For non-cache mode, keys
> will
> + * be evicted to other buckets like cuckoo hash. The table will also
> + * likely to become full before the number of inserted keys equal to
> the
> + * total number of entries.
> + *
> + * For vBF, num_keys equal to the expected number of keys that
> will
> + * be inserted into the vBF. The implementation assumes the keys
> are
> + * evenly distributed to each BF in vBF. This is used to calculate the
> + * number of bits we need for each BF. User does not specify the
> size of
> + * each BF directly because the optimal size depends on the
> num_keys
> + * and false positive rate.
> + */
> + uint32_t num_keys;
> +
> +
Leave just one blank line between fields.
> + /**
> + * The length of key is used for hash calculation. Since key is not
> + * stored in set-summary, large key does not require more memory
> space.
> + */
> + uint32_t key_len;
> +
> +
> + /**
> + * num_set is only relevant to vBF based setsummary.
> + * num_set is equal to the number of BFs in vBF. For current
> + * implementation, it only supports 1,2,4,8,16,32 BFs in one vBF set
> + * summary. If other number of sets are needed, for example 5, the
> user
> + * should allocate the minimum available value that larger than 5,
> + * which is 8.
> + */
> + uint32_t num_set;
> +
> + /**
> + * false_postive_rate is only relevant to vBF based setsummary.
> + * false_postivie_rate is the user-defined false positive rate
> + * given expected number of inserted keys (num_keys). It is used to
> + * calculate the total number of bits for each BF, and the number of
> + * hash values used during lookup and insertion. For details please
> + * refer to vBF implementation and membership library
> documentation.
> + *
> + * HT setsummary's false positive rate is in the order of:
> + * false_pos = (1/bucket_count)*(1/2^16), since we use 16-bit
> signature.
> + * This is because two keys needs to map to same bucket and same
> + * signature to have a collision (false positive). bucket_count is
> equal
> + * to number of entries (num_keys) divided by entry count per
> bucket
> + * (RTE_MEMBER_BUCKET_ENTRIES). Thus, the false_postivie_rate
> is not
> + * directly set by users.
> + */
Typo in "positive" in several places in the comment above.
Also, clarify that "false_positive_rate" is not directly set by users for HT mode.
> + float false_positive_rate;
...
> + * Lookup key in set-summary (SS).
> + * Single key lookup and return as soon as the first match found
Leave a blank line before parameters.
> + * @param setsum
> + * pointer of a setsummary
> + * @param key
> + * pointer of the key that needs to lookup
Better something like this "Pointer of the key to be looked up"
> + * @param set_id
> + * output the set id matches the key
> + * @return
> + * return 1 for found a match and 0 for not found a match
> + */
> +
Definitions of the fields should start with capital letter.
> +int
> +rte_member_lookup(const void *setsum,
> + const void *key, MEMBER_SET_TYPE *set_id);
> +
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * lookup bulk of keys in set-summary (SS).
"Lookup bulk"
> + * Each key lookup returns as soon as the first match found
> + * @param setsum
> + * Pointer of a setsummary
> + * @param keys
> + * Pointer of bulk of keys that to be lookup
"Pointer of bulk of keys to be looked up".
Check for this in other parts of the code.
> + * @param num_keys
> + * Number of keys that will be lookup
> + * @param set_ids
> + * Output set ids for all the keys to this array.
> + * User should preallocate array that can contain all results, which size is
> + * the num_keys.
> + * @return
> + * The number of keys that found a match
> + */
> +
> +
> +int
> +rte_member_lookup_bulk(const void *setsum,
> + const void **keys, uint32_t num_keys,
> + MEMBER_SET_TYPE *set_ids);
> +
> +
> +
> +
Too many blank spaces. Generally leave one between functions.
Check for this in the rest of the files.
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Lookup a bulk of keys in set-summary (SS) for multiple matches each
> key.
> + * Each key lookup will find all matched entries (multiple match).
> + * Note that for cache mode HT, each key can have at most one match. So
> + * multi-match function is mainly used for vBF and non-cache mode HT.
> + * @param setsum
> + * pointer of a setsummary
> + * @param keys
> + * The keys that to be lookup
> + * @param num_keys
> + * The number of keys that will be lookup
> + * @param max_match_per_key
> + * The possible maximum number of matches for each key
> + * @param match_count
> + * Output the number of matches for each key in an array
> + * @param set_ids
> + * Return set ids for all the matches of all keys. User pass in a
> preallocated
"User passes in"
> + * 2D array with first dimension as key index and second dimension as
> match
> + * index. For example set_ids[bulk_size][max_match_per_key]
> + * @return
> + * The number of keys that found one or more matches in the set-
> summary
> + */
> +
> +int
> +rte_member_lookup_multi_bulk(const void *setsum,
> + const void **keys, uint32_t num_keys,
> + uint32_t max_match_per_key,
> + uint32_t *match_count,
> + MEMBER_SET_TYPE *set_ids);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Insert key into set-summary (SS).
> + *
> + * @param setsum
> + * pointer of a set-summary
Start with capital letter.
> + * @param key
> + * the key that needs to be added
> + * @param set_id
> + * The set id associated with the key that needs to be added. Different
> mode
> + * supports different set_id ranges. 0 cannot be used as set_id since
> + * RTE_MEMBER_NO_MATCH by default is set as 0.
> + * For HT mode, the set_id has range as [1, 0x7FFF], MSB is reserved.
> + * For vBF mode the set id is limited by the num_set parameter when
> create
> + * the set-summary.
> + * @return
> + * HT (cache mode) and vBF should never fail unless the set_id is not in
> the
> + * valid range. In such case -EINVAL is returned.
> + * For HT (non-cache mode) it could fail with -ENOSPC error code when
> table is
> + * full.
> + * For success it returns different values for different modes to provide
> + * extra information for users.
> + * Return 0 for HT (cache mode) if the add does not cause
> + * eviction, return 1 otherwise. Return 0 for HT mode if success, -ENOSPC
For HT non-cache mode?
> for
> + * full, and 1 if cuckoo eviction happens. Always return 0 for vBF mode.
> + */
> +
> +int
> +rte_member_add(const void *setsum, const void *key,
> + MEMBER_SET_TYPE set_id);
> +
> +
...
> diff --git a/lib/librte_member/rte_member_version.map
> b/lib/librte_member/rte_member_version.map
> new file mode 100644
> index 0000000..a5877c9
> --- /dev/null
> +++ b/lib/librte_member/rte_member_version.map
> @@ -0,0 +1,16 @@
> +DPDK_17.11 {
> + global:
> +
> + rte_member_create;
> + rte_member_find_existing;
> + rte_member_lookup;
> + rte_member_lookup_bulk;
> + rte_member_lookup_multi;
> + rte_member_lookup_multi_bulk;
> + rte_member_add;
> + rte_member_free;
> + rte_member_reset;
> + rte_member_delete;
This list must be in alphabetical order.
> +
> + local: *;
> +};
> --
> 2.7.4
More information about the dev
mailing list