Hi Yipeng,

> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Wednesday, September 27, 2017 6:40 PM
> To: dev@dpdk.org
> Cc: tho...@monjalon.net; Tai, Charlie <charlie....@intel.com>; Gobriel,
> Sameh <sameh.gobr...@intel.com>; De Lara Guarch, Pablo
> <pablo.de.lara.gua...@intel.com>; Mcnamara, John
> <john.mcnam...@intel.com>; Wang, Yipeng1 <yipeng1.w...@intel.com>
> Subject: [PATCH v4 1/7] member: implement main API
> 
> Membership library is an extension and generalization of a traditional filter
> (for example Bloom Filter) structure. In general, the Membership library is a
> data structure that provides a "set-summary" and responds to set-
> membership queries of whether a certain element belongs to a set(s). A
> membership test for an element will return the set this element belongs to
> or not-found if the element is never inserted into the set-summary.
> 
> The results of the membership test are not 100% accurate. Certain false
> positive or false negative probability could exist. However, comparing to a
> "full-blown" complete list of elements, a "set-summary"
> is memory efficient and fast on lookup.
> 
> This patch adds the main API definition.
> 
> Signed-off-by: Yipeng Wang <yipeng1.w...@intel.com>

A few comments on changes that you didn't make in the v4.

Thanks,
Pablo

...

> +
> +struct rte_member_setsum *
> +rte_member_create(const struct rte_member_parameters *params) {
> +     struct rte_tailq_entry *te;
> +     struct rte_member_list *member_list;
> +     struct rte_member_setsum *setsum;
> +     int ret;
> +
> +     if (params == NULL) {
> +             rte_errno = EINVAL;
> +             return NULL;
> +     }
> +
> +     if (params->key_len == 0 ||
> +                     params->prim_hash_seed == params-
> >sec_hash_seed) {
> +             rte_errno = EINVAL;
> +             RTE_MEMBER_LOG(ERR, "Memship create with invalid
> parameters\n");

Do not use "Memship". Change to " rte_member_create has invalid parameters"?
Or something else that you want, but not Memship.

> +             return NULL;
> +     }
> +

...

> +struct rte_member_parameters {
> +     const char *name;                       /**< Name of the hash. */
> +
> +     /**
> +      * User to specify the type of the setsummary from one of
> +      * rte_member_setsum_type.
> +      *
> +      * HT based setsummary is implemented like a hash table. User
> should use
> +      * this type when there are many sets.
> +      *
> +      * vBF setsummary is a vector of bloom filters. It is used when
> number
> +      * of sets is not big (less than 32 for current implementation).
> +      */
> +     enum rte_member_setsum_type type;
> +
> +     /**
> +      * If it is HT based setsummary, user to specify the subtype or mode
> +      * of the setsummary. It could be cache, or non-cache mode.
> +      * Set iscache to be 1 if to use as cache mode.

Change to "is_cache".

> +      *
> +      * For cache mode, keys can be evicted out of the HT setsummary.
> Keys
> +      * with the same signature and map to the same bucket
> +      * will overwrite each other in the setsummary table.
> +      * This mode is useful for the case that the set-summary only
> +      * needs to keep record of the recently inserted keys. Both
> +      * false-negative and false-positive could happen.
> +      *
> +      * For non-cache mode, keys cannot be evicted out of the cache. So
> for
> +      * this mode the setsummary will become full eventually. Keys with
> the
> +      * same signature but map to the same bucket will still occupy
> multiple
> +      * entries. This mode does not give false-negative result.
> +      */
> +     uint8_t is_cache;
> +
> +     /**
> +      * For HT setsummary, num_keys equals to the number of entries
> of the
> +      * table. When the number of keys that inserted in the HT
> setsummary

"number of keys inserted in the HT summary". Or "were inserted".

> +      * approaches this number, eviction could happen. For cache mode,
> +      * keys could be evicted out of the table. For non-cache mode, keys
> will

...

> +     /**
> +      * false_positive_rate is only relevant to vBF based setsummary.
> +      * false_positive_rate is the user-defined false positive rate
> +      * given expected number of inserted keys (num_keys). It is used to
> +      * calculate the total number of bits for each BF, and the number of
> +      * hash values used during lookup and insertion. For details please
> +      * refer to vBF implementation and membership library
> documentation.
> +      * Note that this parameter is not directly set by users for HT mode.
> +      *
> +      * HT setsummary's false positive rate is in the order of:
> +      * false_pos = (1/bucket_count)*(1/2^16), since we use 16-bit
> signature.
> +      * This is because two keys needs to map to same bucket and same
> +      * signature to have a collision (false positive). bucket_count is
> equal
> +      * to number of entries (num_keys) divided by entry count per
> bucket
> +      * (RTE_MEMBER_BUCKET_ENTRIES). Thus, the false_positive_rate
> is not
> +      * directly set by users.

Unless I understood it incorrectly, I think you should clarify that
this field is not set for HT mode, but it is for vBF, right?

> +      */
> +     float false_positive_rate;
> +
> +     /**
> +      * We use two seeds to calculate two independent hashes for each
> key.
> +      *
> +      * For HT type, one hash is used as signature, and the other is used
> +      * for bucket location.
> +      * For vBF type, these two hashes and their combinations are used
> as
> +      * hash locations to index the bit array.
> +      */
> +     uint32_t prim_hash_seed;
> +
> +     /**
> +      * The secondary seed should be a different value from the primary
> seed.
> +      */
> +     uint32_t sec_hash_seed;
> +
> +     int socket_id;                  /**< NUMA Socket ID for memory.
> */
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Find an existing set-summary and return a pointer to it.
> + *
> + * @param name
> + *   Name of the set-summary.
> + * @return
> + *   Pointer to the set-summary or NULL if object not found
> + *   with rte_errno set appropriately. Possible rte_errno values include:
> + *    - ENOENT - value not available for return
> + */
> +struct rte_member_setsum *
> +rte_member_find_existing(const char *name);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Create set-summary (SS).
> + *
> + * @param params
> + *   Parameters to initialize the setsummary.
> + * @return
> + *   Return the pointer to the setsummary.
> + *   Return value is NULL if the creation failed.
> + */
> +struct rte_member_setsum *
> +rte_member_create(const struct rte_member_parameters *params);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Lookup key in set-summary (SS).
> + * Single key lookup and return as soon as the first match found
> + *
> + * @param setsum
> + *   Pointer of a setsummary.
> + * @param key
> + *   Pointer of the key to be looked up.
> + * @param set_id
> + *   Output the set id matches the key.
> + * @return
> + *   Return 1 for found a match and 0 for not found a match.
> + */
> +int
> +rte_member_lookup(const struct rte_member_setsum *setsum, const
> void *key,
> +                     member_set_t *set_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * lookup bulk of keys in set-summary (SS).

"Lookup bulk"

> + * Each key lookup returns as soon as the first match found
> + *
> + * @param setsum
> + *   Pointer of a setsummary.
> + * @param keys
> + *   Pointer of the bulk of keys to be looked up.
> + * @param num_keys
> + *   Number of keys that will be lookup.
> + * @param set_ids
> + *   Output set ids for all the keys to this array.
> + *   User should preallocate array that can contain all results, which size 
> is
> + *   the num_keys.
> + * @return

Reply via email to