Hi, Bing,  I am sorry for the late reply:

>> >1.8.3.1
>>
>> [Wang, Yipeng]
>>
>> Hi, Bing,
>>
>> Table resizing is very important, and I think it is a critical feature to have
>> in DPDK's hash library. Thanks a lot for putting effort into this, and I hope
>> we can work out a solution based on your patch.
>>
>
>Thanks. To my understanding, yes, it is very important. In particular, there
>are cases where the maximum capacity is quite large but only a few elements
>exist during one runtime life cycle.
>
>> My major concern of the current implementation is as following:
>>
>> Although we still have the fbk hash table implementation, I think the
>> goal here is to have users move to the main hash table
>> implementation (the cuckoo algorithm), since it is the better one in
>> most use cases. Many optimizations have been done on the
>> current code base since the cuckoo hash was introduced, and we do not
>> want to reinvent the wheel for things such as multi-threading support, non-
>> TSO machine support, SIMD optimizations, etc. Also, compared to a
>> linked list, the current bucketized table should provide better lookup
>> performance. Last but not least, we probably don't want fragmented
>> hash table APIs.
>>
>
>Yes, I see and I agree. I have done some basic study of the cuckoo hash, and it
>is quite a nice implementation, from the paper to the engineering. A lot of
>optimizations have been done on both x86_64 and aarch64. A single, unified
>solution will be much easier and less problematic for users and
>maintainers.
>
>> I would suggest seeing if we could add the resizing feature to the current
>> hash table implementation first.
>> For example, we can use the current bucketized design to do the
>> resizing rather than the new linked list-based data structure.
>> We do need some modifications such as having a local bucket_shift
>> variable for each bucket, but I think it is feasible. We could also turn off
>> the cuckoo path/extendable linked list of the current implementation
>> when resizing is needed, so that the implementation would be simpler.
>> This way, we keep the current API and also reuse much of the
>> existing code.
>>
>
>It would be great if resizing could be realized with the cuckoo hash.
>I am not quite familiar with the current status of the research. Are there
>any recent papers or ideas on resizing the cuckoo hash algorithm?
>
[Wang, Yipeng] 
I am not aware of new papers focusing on hash table resizing, but I encountered
one resizing algorithm in the paper "Elastic Sketch: Adaptive and Fast
Network-wide Measurements". If you look at Section 3.4, the authors first
duplicate the hash table and gradually remove the duplicates. But I think the
duplication would still be too costly.

I guess what I suggested is to disable the cuckoo path while we resize. If we
just treat it as a regular hash table, then the resizing would be similar to
your linked list algorithm, right? You would just redistribute the contents of
the buckets rather than breaking down the linked list.
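
To make it concrete, here is a rough, untested sketch of the kind of thing I am
picturing. None of these names (toy_table, toy_resize, toy_hash, ...) come from
the existing rte_hash code or from your patch; it is only meant to show the
bucket redistribution step with the cuckoo path turned off:

#include <stdint.h>
#include <stdlib.h>

#define ENTRIES_PER_BUCKET 4

struct toy_entry {
	uint64_t key;
	void *data;
	int used;
};

struct toy_bucket {
	struct toy_entry ent[ENTRIES_PER_BUCKET];
};

struct toy_table {
	uint32_t bucket_mask;          /* num_buckets - 1, power of two */
	struct toy_bucket *buckets;
};

static uint32_t toy_hash(uint64_t key)
{
	/* stand-in for whatever hash function the table is configured with */
	key ^= key >> 33;
	key *= 0xff51afd7ed558ccdULL;
	key ^= key >> 33;
	return (uint32_t)key;
}

/*
 * Double the bucket array and redistribute every entry into its new
 * bucket.  With the cuckoo path disabled this is a plain rehash: each
 * entry of old bucket i lands either in new bucket i or i + old_n.
 */
static int toy_resize(struct toy_table *t)
{
	uint32_t old_n = t->bucket_mask + 1;
	uint32_t new_n = old_n * 2;
	struct toy_bucket *nb = calloc(new_n, sizeof(*nb));
	uint32_t i, j, k;

	if (nb == NULL)
		return -1;

	for (i = 0; i < old_n; i++) {
		for (j = 0; j < ENTRIES_PER_BUCKET; j++) {
			struct toy_entry *e = &t->buckets[i].ent[j];
			struct toy_bucket *dst;

			if (!e->used)
				continue;
			dst = &nb[toy_hash(e->key) & (new_n - 1)];
			for (k = 0; k < ENTRIES_PER_BUCKET; k++) {
				if (!dst->ent[k].used) {
					dst->ent[k] = *e;
					break;
				}
			}
			/*
			 * A real cuckoo table also stores entries in their
			 * alternative bucket, so dst could be full there;
			 * this toy ignores that case.
			 */
		}
	}
	free(t->buckets);
	t->buckets = nb;
	t->bucket_mask = new_n - 1;
	return 0;
}

Since each old bucket only splits into a fixed pair of new buckets, the
redistribution stays local and could also be done one bucket at a time.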

>> Also, I wonder if you have a specific use case that benefits from the
>> gradual break-down approach, since we could always stop the world
>> and resize the table at once.
>
>Do you mean the on-demand (when being accessed) split? Theoretically it is
>no faster in total time, and may even be slower because of the extra
>calculation and checking. It is only meant to smooth the rate when there are
>already a lot of elements in the table, e.g. 1M. Traversing 1M elements and
>relocating them into the right positions will be very time consuming.
>Especially since the memory is dynamically allocated and not contiguous, the
>cache may also be a problem with the linked list. I tested with a simple case
>and the rate seemed to be smooth (no big jitter) - there were also some other
>actions in the case, and the hash actions occupied about 15% of the CPU.
>And this is only for the list case. I think when using cuckoo hash, maybe
>there is no need for such a trick 😊
[Wang, Yipeng] 
I guess I was thinking that if you have a specific use case,
we could design the algorithm with the use case in mind.
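
Also, just to make sure I follow the gradual split you describe, below is a
rough, untested sketch of how I picture it for a linked-list table. None of the
names (ll_table, ll_split_one, ...) come from your patch; the idea is to split
one old bucket per lookup/insert, so a 1M-entry table never pays for the whole
rehash in a single call:

#include <stdint.h>
#include <stdlib.h>

struct ll_node {
	uint64_t key;
	void *data;
	struct ll_node *next;
};

struct ll_table {
	struct ll_node **buckets;  /* already grown to 2 * old_n slots */
	uint32_t old_mask;         /* old_n - 1 */
	uint32_t new_mask;         /* 2 * old_n - 1 */
	uint32_t split_pos;        /* old buckets below this index are split */
};

static uint32_t ll_hash(uint64_t key)
{
	key ^= key >> 33;
	key *= 0xff51afd7ed558ccdULL;
	key ^= key >> 33;
	return (uint32_t)key;
}

/*
 * Split a single old bucket: walk its chain once and move every node
 * whose new hash bit is set into the sibling bucket (idx + old_n).
 */
static void ll_split_one(struct ll_table *t, uint32_t idx)
{
	struct ll_node **src = &t->buckets[idx];
	struct ll_node **sib = &t->buckets[idx + t->old_mask + 1];

	while (*src != NULL) {
		struct ll_node *n = *src;

		if ((ll_hash(n->key) & t->new_mask) != idx) {
			*src = n->next;   /* unlink from the old chain */
			n->next = *sib;   /* push onto the sibling chain */
			*sib = n;
		} else {
			src = &n->next;
		}
	}
}

/* Which bucket does a key live in while the split is still in progress? */
static uint32_t ll_bucket_of(const struct ll_table *t, uint64_t key)
{
	uint32_t h = ll_hash(key);
	uint32_t old_idx = h & t->old_mask;

	if (old_idx >= t->split_pos)
		return old_idx;           /* not split yet */
	return h & t->new_mask;           /* already split, use the wide mask */
}

/* Called from every lookup/insert until split_pos passes the last old bucket. */
static void ll_split_step(struct ll_table *t)
{
	if (t->split_pos <= t->old_mask)
		ll_split_one(t, t->split_pos++);
}

If that roughly matches what your patch does, I think the same
one-bucket-at-a-time idea could also be applied to the bucketized table sketch
above.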

Thank you!
Yipeng
