Hi, Bing, I am sorry for the late reply:

>> >1.8.3.1
>>
>> [Wang, Yipeng]
>>
>> Hi, Bing,
>>
>> Table resizing is very important and I think it is a critical feature
>> to have in DPDK's hash library. Thanks a lot for putting effort into
>> this, and I hope we can work out a solution based on your patch.
>
> Thanks. To my understanding, yes, it is very important. In particular,
> there are cases where the maximum capacity is quite large but only a
> few elements exist during one runtime life cycle.
>
>> My major concern with the current implementation is the following:
>>
>> Although we still have the fbk hash table implementation, I think the
>> goal here is to have users move to the main hash table implementation
>> (the cuckoo algorithm), since it is the better one in most use cases.
>> Many optimizations have been made to the current code base since the
>> cuckoo hash was introduced, and we do not want to reinvent the wheel
>> for things such as multi-threading support, non-TSO machine support,
>> SIMD optimizations, etc. Also, compared to a linked list, the current
>> bucketized table should provide better lookup performance. Last but
>> not least, we probably don't want fragmented hash table APIs.
>
> Yes, I see and I agree. I have done some basic study of the cuckoo
> hash, and it is quite a nice piece of work, from the paper to the
> engineering, with a lot of optimizations on both x86_64 and aarch64.
> A single solution will also be much easier and less problematic for
> users and maintainers.
>
>> I would suggest seeing whether we could add the resizing feature to
>> the current hash table implementation first. For example, we could
>> use the current bucketized design to do the resizing rather than the
>> new linked list-based data structure. We would need some
>> modifications, such as having a local bucket_shift variable for each
>> bucket, but I think it is feasible.
>> We could also turn off the cuckoo path/extendable linked list of the
>> current implementation when resizing is needed, so that the
>> implementation would be simpler. In this way, we keep the current API
>> and also reuse much of the existing code.
>
> It would be great if resizing could be realized with the cuckoo hash.
> I am not very familiar with the current state of the research. Are
> there already any recent papers or ideas on resizing the cuckoo hash?

[Wang, Yipeng] I am not aware of new papers focusing on hash table
resizing, but I encountered one resizing algorithm in this paper:
"Elastic Sketch: Adaptive and Fast Network-wide Measurements". If you
look at section 3.4, the authors first duplicate the hash table and
then gradually remove the duplicates. But I think the duplication
would still be too costly.
I guess what I suggested is to disable the cuckoo path while we resize.
If we treat it as a regular hash table during that time, the resizing
would be similar to your linked list algorithm, right? You would just
redistribute the contents among the buckets rather than breaking down
the linked list.

>> Also, I wonder if you have a specific use case that benefits from the
>> gradual break-down approach, since we could always stop the world and
>> resize the table at once.
>
> Do you mean the on-demand (when being accessed) split? Theoretically,
> it saves no total time - it may even cost more because of the extra
> calculation and checking. It is only meant to smooth the rate when
> there are already a lot of elements in the table, e.g. 1M. Traversing
> 1M elements and relocating them to the right positions would be very
> time consuming. Especially since the memory is dynamically allocated
> and not contiguous, the cache may also be a problem with the linked
> list. I tested with a simple case and the rate seemed to be smooth
> (no big jitter) - the case also performs some other actions, and the
> hash actions occupy about 15% of the CPU.
> And this is only for the list case. I think when using cuckoo hash,
> maybe there is no need for such a trick 😊

[Wang, Yipeng] I guess I was thinking that if you have a specific use
case, we could design the algorithm with that use case in mind.

Thank you!
Yipeng