On 07/31/2018 10:57 AM, Wiles, Keith wrote:
On Jul 31, 2018, at 1:09 AM, Fu, Qiaobin <qiaob...@bu.edu> wrote:
Hi Yipeng,
Thanks for the feedbacks!
On Jul 30, 2018, at 4:24 PM, Wang, Yipeng1 <yipeng1.w...@intel.com> wrote:
Hi, Qiaobin,
Thanks for the patch. If I understand correctly your use case is to use hash table as a
"cache" that new entries should evict stale ones automatically. Could you be
more specific on your use scenarios so that I can have more context?
Actually, we didn’t use hash table as a “cache”, and there is no automation to
evict stale ones. Instead, the functionrte_hash_bucket_iterate() allows the
callers to iterate over the hash table in an incremental way, i.e., one bucket
by another bucket, so that the caller doesn’t have to iterate over the whole
hash table in one time. This way can avoid creating hiccups and achieve better
cache performance. One possible use scenario is when DDoS attacks happen, one
may want to take more CPU time to process the packets, thus iterating over the
hash table incrementally is desirable.
I do not see this as a cache, but only a way to iterate over the hash table.
We are actually working on an extendable version of rte_hash which will link
extra buckets to current table if the table is full. As long as the table is
sized appropriately, there will be no conflict issues thus you don’t need to
replace old entries. Will this solve your issue? Also, if the “cache” mode is
what you want, we have the membership library “rte_member”. Is it applicable to
your case?
I agree that adding extra buckets to current table when the table is full can
alleviate this scenario. Indeed, we also considered this solution before coming
up our proposed solution. However, it’s still highly desirable to have a bucket
iterator function. Considering the scenario where the machine’s memory is
limited and cannot always allocate new memory to the hash table (e.g., DDoS
attacks, switch measurement tasks, etc.), a solution is to allow the callers
evict some less important (the criteria for the eviction is based on the
caller’s needs) entries.
Indeed, we don’t have the “cache” mode, our implementation is a way to achieve
better cache performance. So, the rte_member won’t help here.
w.r.t. the patch set you proposed, my concern is that the new API will expose
too much internals to the user, e.g. the bucket/entries structure should be
implementation dependent. Exposing internal structures would prevent improving
the implementation down the road unless we keep changing API which is not
recommended.
The functions we add here give a way for the callers to iterate over the
specific bucket, which potentially contains the entry. These functions can be
made general enough to allow callers to heuristically evict some entries, and
add the new ones to the hash table. Otherwise, there is no way to evict some
less important entries.
We have many other iterators in DPDK and this one is no different. If we can
hide the internals, then it would be good. The callback function should not
expose any internals only the value in the hash table, which I believe is do
just that, right?
The use case that led to this iterator is the following: a hash
table of flows is overloaded due to a DoS attack. The easy option is to
ignore new flows, but this is not optimal when there's information
pointing out that a given new flow is more important than one flow in
the bucket in which the new flow must be inserted. So the high-level
interpretation for this iterator is to find out which are the candidates
such that one must be dropped to add a new flow. That is why the patch
adds rte_hash_get_primary_bucket() and rte_hash_get_secondary_bucket().
We don't need more than these candidates because the memory lookups
would be overwhelming. In fact, we have found that the eight candidates
that rte_hash_get_primary_bucket() gives is already good enough.
Once we solved the problem above, we've noticed that with small
adjustments of the code, it would make it easy to scan a hash table for
stale entries one bucket at a time instead of an entry at a time (the
regular iterator). The idea is that once one reads the bucket location,
the information about the all entries is coming together into the
processor cache, so we can take advantage of the information that is
already there.
While the second feature is good to have, the first one is the
feature we must have in our software.
[ ]'s
Michel Machado